FOREWORD


The seventh 'Aha Huliko'a* Hawaiian Winter Workshop was held January 12-15, 1993 at
the East-West Center in Honolulu, Hawaii. The topic was “Statistical Methods in Physical
Oceanography.”

Physical  oceanographers  deal  with  randomness  and  uncertainties  when  analyzing  ocean 
data  and  formulating  ocean  models.  They  apply  concepts  and  results  from  probability 
theory,  statistical  inference  and  stochastic  processes.  The  size  and  complexity  of 
oceanographic  problems  often  prevent  the  application  of  standard  methods,  and  physical 
oceanographers  are  faced  with  the  task  of  inventing  special  methods  that  deal  with  the 
peculiarities  of  their  problems  in  a  sensible  way.  These  special  methods  were  the  object  of 
the workshop’s lectures and discussions. The lectures are published in these proceedings.
The order of the papers loosely follows the agenda of the workshop, covering a variety of
oceanographic observations, methods for efficient flow and data representation,
frequentist versus Bayesian inference, data assimilation, and idealized dynamics. Also
included  is  a  summary  of  the  meeting. 

The  workshop,  made  possible  by  a  grant  from  the  U.S.  Office  of  Naval  Research,  was 
hosted  by  the  Department  of  Oceanography  of  the  School  of  Ocean  and  Earth  Science  and 
Technology  of  the  University  of  Hawaii.  The  excellent  facilities  of  the  East-West  Center 
and  the  capable  staff  directed  by  James  McMahon  contributed  greatly  to  the  success  of  the 
meeting.  The  local  organization  and  logistical  arrangements  were  expertly  handled  by 
Phyllis  Haines.  This  proceedings  volume  came  into  existence  through  the  creative  and 
dedicated  research  of  the  scientists  who  gathered  in  Hawaii  and  provided  the  articles  that 
follow.  Barbara  Jones  and  May  Izumi  provided  skillful  production  assistance. 

* 'Aha Huliko'a is a Hawaiian phrase meaning an assembly that seeks into the depth of a matter.

MEASUREMENT AND ANALYSIS OF THE ENERGY-CONTAINING EDDIES OF TURBULENT FLOWS IN THE COASTAL OCEAN

Ann  E.  Gargett 

Institute of Ocean Sciences, Sidney, B.C., Canada

ABSTRACT

Acoustic  remote  sensing  techniques  now  allow  measurement  of  the  three-dimensional 
velocity  field  associated  with  the  large-scale  eddies  of  turbulent  geophysical  flows  in  the 
coastal  ocean.  Such  techniques,  continuous  in  time  and  requiring  a  minimum  of  technical 
supervision,  are  essential  for  assessment  of  turbulent  coastal  regimes,  because  of  short 
space  and  time  scales  of  variability.  Algorithms  under  development  should  provide 
estimates of kinetic energy E, length scales, and kinetic energy dissipation rate ε of the
turbulence, as well as the shear dU/dz of the mean flow. Recent addition of a towed CTD
allows a direct measurement of buoyancy flux $\overline{\rho' w}$, a major goal of ocean microscale
measurements over the last two decades. Preliminary data are available to compare this
direct measurement with the widely used estimate $\overline{\rho' w} = 0.2\,\rho_0\, g^{-1}\,\varepsilon$, made from
measurements of dissipation rate.

1.  AN  ACOUSTIC  REMOTE  SENSING  TOOL  FOR  TURBULENCE  RESEARCH 

While shipborne acoustic Doppler current profilers (ADCPs) have been widely used for
measuring "mean" currents in the surface layers of the ocean, use of a commercial ADCP
for turbulence research required modification to both hardware and software. The
hardware modification was to rotate one of the four beams of a standard Janus-configuration
transducer head to vertical, leaving the other three beams at the normal
(30°) slant angle from vertical. When mounted on a ship (Fig. 1), this beam (B3) is closely
adjusted to vertical (±0.5°), allowing a direct and unequivocal measurement of vertical
velocity  w.  A  combination  of  B4  and  B3  provides  an  estimate  of  across-ship  velocity 
component  u,  while  a  combination  of  B2  and  B3  (or  of  B1  and  B3)  provides  the  along- 
ship  component  v.  These  horizontal  velocity  components  can  be  affected  by  the  slant-beam 
configuration,  so  this  account  will  mostly  use  the  straightforward  measurement  of  w. 

Direct shipborne measurement of w is possible with incoherent Doppler systems because
coastal  turbulence  is  vigorous,  and  because  the  inner  coastal  waters  of  British  Columbia, 
in  which  these  data  were  taken,  provide  low  levels  of  platform  motion  contamination.  If  a 
stable  platform  can  be  provided,  however,  recent  development  of  more  accurate  coded- 
pulse  Doppler  systems  suggests  that  the  techniques  discussed  here  will  soon  be  extensible 
to  the  deep  ocean. 

Figure 1. (a) Hardware modification to standard ADCP: one beam (B3) has been rotated to vertical. (b) Acoustic beam orientation relative to ship-based coordinates (x, y, z).


Special  acquisition  software  was  written  to  allow  recording  of  raw  (single-ping)  beam 
velocities  and  acoustic  amplitudes.  After  each  ping,  the  processor  associated  with  a  single 
beam  returns  time  (radial  distance)-binned  estimates  of  radial  velocity,  defined  positive 
when  the  velocity  is  towards  the  transducer,  and  a  measure  of  the  strength  of  the  return 
signal.  In  acquisition  mode,  both  fields  are  recorded  for  all  four  beams,  while  up  to  four 
fields  can  be  selected  for  colour-coding  and  real-time  display.  At  present,  we  use 
amplitude  signal  only  from  the  vertical  beam,  in  order  to  locate  the  bottom  (or  lack  of  it) 
in  the  velocity  records;  subsequent  processing  uses  only  the  water  column  velocities. 

Single-ping  velocity  data  are  noisy.  Figure  2a  is  a  (poor)  rendition  of  raw  data  from  a 
turbulent tidal front. [An apology: grey-scale rendering of signed quantities such as
velocity is difficult, but must be attempted when colour graphics are not available. For
presentation in this paper, I have chosen to bin the data very coarsely, effectively grey-scale
'contouring' the fields. With such coarse binning, it is possible to use a symmetric
grey-scale  that  differs  only  in  the  textures  assigned  to  the  bins  nearest  zero  (center);  thus 
in  Figure  2,  a  maximum  (black)  that  occurs  as  a  progression  through  light  grey  (small 
circles)  is  a  maximum  downwards  (upwards)  w.  While  this  presentation  works  reasonably 
well  with  smooth  fields,  it  does  a  very  poor  job  of  the  original  noisy  raw  data  in  Figure 
2a.]  The  standard  technique  for  reducing  the  noise  level  of  Doppler  velocity  estimates  is  to 
average  values  from  consecutive  pings;  Figure  2b  illustrates  this  technique,  using  an 

2. WHY DO WE NEED THE VERTICAL BEAM?

While  significant  vertical  velocities  do  not  guarantee  that  a  flow  is  turbulent,  flows  are  not 
turbulent  without  significant  vertical  velocities.  In  survey  mode,  we  may  thus  look  for 
large  vertical  velocity  as  a  necessary  condition  for  turbulence.  Having  found  this 
condition,  such  flows  may  be  subject  to  more  rigorous  scrutiny  with  regard  to 
characteristics — for  example  relative  "eddy"  and  internal  wave  time  scales,  vertical 
buoyancy  flux,  phase  between  w  and  fluctuation  density — which  we  associate  with 
turbulence.  Thus  accurate  measurement  of  the  vertical  velocity  field  is  essential  to 
turbulence  measurement. 

With  a  standard  ADCP,  velocity  components  are  calculated  under  the  assumption  that  the 
velocity  field  is  uniform  over  the  spread  of  slant  beam  pairs  (Fig.  3a).  If  this  is  the  case,  the 
horizontal component v in the plane of B1 and B2 makes contributions of opposite sign to
the beam velocities V1 and V2 in bin b; hence the slant-beam vertical velocity is
ws = (V1 + V2)/(2 cos 30°). This slant-beam vertical velocity is shown in Figure 3c, below the
field  of  w  measured  directly  by  the  vertical  beam  (Fig.  3b)  for  a  section  of  data  from  a  tidal 
front.  The  obvious  differences  between  w  and  ws  are  caused  by  the  fact  that  the  turbulent 
field  has  spatial  scales  that  are  comparable  to  the  slant  beam  spread. 
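
The two-beam relation can be checked numerically. The following is a minimal sketch, assuming idealized beams at 30° on either side of vertical and the decomposition V1 = w cos θ + v sin θ, V2 = w cos θ − v sin θ; the sign convention, function names, and velocity values are illustrative assumptions and are not taken from the DOT software. It shows that ws recovers w when the flow is uniform across the beam pair, and is contaminated when the two beams sample different velocities.

```python
import math

THETA = math.radians(30.0)  # slant angle from vertical, as quoted in the text


def beam_velocities(w1, v1, w2, v2, theta=THETA):
    """Radial velocities along two opposed slant beams.

    (w1, v1) and (w2, v2) are the vertical and horizontal velocities at the
    positions sampled by beams 1 and 2. The sign convention is an assumption.
    """
    V1 = w1 * math.cos(theta) + v1 * math.sin(theta)
    V2 = w2 * math.cos(theta) - v2 * math.sin(theta)
    return V1, V2


def slant_beam_w(V1, V2, theta=THETA):
    """Slant-beam vertical velocity ws = (V1 + V2) / (2 cos theta)."""
    return (V1 + V2) / (2.0 * math.cos(theta))


# Uniform flow across the beam spread: ws equals the true w (0.10 m/s).
ws_uniform = slant_beam_w(*beam_velocities(0.10, 0.50, 0.10, 0.50))

# Eddy smaller than the beam spread: w and v differ between the two beams,
# so ws mixes vertical and horizontal contributions.
ws_nonuniform = slant_beam_w(*beam_velocities(0.10, 0.50, -0.10, 0.30))
```

In the second case the mean vertical velocity of the two sampled points is zero, yet ws is not, because the difference in v between the beams no longer cancels.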

Scatter plots of ws vs w (Fig. 4) show that while ws ≈ w at shallow depths (a), the
correlation decreases with increasing depth (b); by the deepest bins (c), ws is essentially
uncorrelated  with  w,  although  both  remain  significantly  above  the  noise  level,  shown  in 
(d).  This  must  be  expected  to  be  a  normal  state  of  affairs  in  coastal  waters,  where  the 
water  depth  H  sets  a  maximum  outer  scale  for  turbulent  eddies  (the  actual  outer  scale  may 
be  even  smaller,  because  of  conditions  of  shear  or  stratification).  With  the  30°  angle  of  the 
standard  slant  beam  pairs,  slant  beam  separation  at  depth  H  is  H,  i.e.,  the  scale  at  or  below 
which  we  expect  turbulent  energy  to  reside.  Accurate  measurement  of  the  vertical  velocity 
field  in  coastal  areas  thus  clearly  requires  the  special  vertical  beam  that  is  part  of  the 
DOppler  Turbulence  system  (DOT). 
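
For reference, the geometry behind the beam-separation statement can be written out, assuming two beams inclined at 30° on opposite sides of vertical and a flat bottom at depth H:

```latex
\Delta x(H) = 2H\tan 30^\circ = \frac{2}{\sqrt{3}}\,H \approx 1.15\,H
```

The separation near the bottom is therefore comparable to the largest eddy scale that the water depth allows, which is why uniformity across the beam pair cannot be assumed there.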

Figure 3. (a) Accurate determination of w from two slant beams (B1 and B2) requires that the velocity
field be uniform over the (increasing with depth) horizontal spread between the beams. Fields of (b) w from
B3 and (c) ws from B1 and B2 differ considerably in this tidal front, suggesting that this requirement is not

Figure 4. Scatter plots of ws, vertical velocity determined from the paired slant beams B1 and B2, versus the “true” w measured from B3 (axes in cm/s), for various depths: (a) 23 m, (b) 112 m, and (c) 201 m; (d) is the noise level, taken at slack tide in a sheltered location. Near the transducer, the two variables are correlated, but as depth (slant beam separation) increases, ws and w become increasingly uncorrelated.


3.  AN  ALBUM  OF  COASTAL  MIXING 


With  the  shipborne,  semi-automated  system  described  above,  it  is  possible  to  survey 
coastal  waters  for  locations  and  processes  that  cause  significant  turbulence.  Our 
experience  is  that  most  intense  turbulence  is  associated  in  some  way  with  flow  geometry 
such  as  submarine  sills,  horizontal  channel  constriction,  or  sharp  changes  in  channel 
direction.  Coastal  turbulence  varies  rapidly  in  time,  since  it  is  driven  predominantly  by  the 
tides  and  is  clearly  modulated  on  the  neap/spring  cycle. 

Figure 5 is a sampler of the kind of mixing regimes found in B.C. coastal waters. The
depth range of the measurements varies, as marked; the horizontal scale is ~1100 m. In the
upper  panel  (a)  is  a  record  taken  in  mid-winter  at  a  time  of  minimum  water  column 
stratification.  The  tide  floods  from  left  to  right  over  a  sharp  submarine  sill  that  nearly 
blocks  a  tidal  channel  located  in  the  southern  Strait  of  Georgia.  Water  descends  the 
downstream  side  of  the  sill  with  vertical  velocity  near  1  m/s;  the  subsequent  flow  exhibits 
intense  fluctuations  of  vertical  velocity  far  downstream.  The  centre  panel  (b)  is  another 
situation  in  which  the  tide  floods  from  left  to  right  across  a  sill;  this  however  is  a  silled, 
fjord-type inlet, at a time of very strong near-surface density stratification. Whether
because  of  this  stratification  "cap"  or  because  of  the  gentler  sill  relief,  dense  water  from 
outside  the  sill  is  found  entering  the  inlet  on  the  flood  as  a  bottom  boundary  current,  most 
visible  in  the  vertical  velocity  field  at  those  places  where  it  accelerates  downwards  with 
increases  in  bottom  slope.  A  final  example  in  Figure  5c  shows  a  turbulent  surface  jet 
flowing (left to right) out of a narrow and shallow tidal passage. Water exiting the passage
is  well-mixed  and  lighter  than  the  deeper  water  outside,  hence  flows  out  at  the  surface. 
Abrupt  increase  in  channel  width  causes  rapid  shallowing  of  the  jet  just  outside  the  channel 

4. ESTIMATION OF TURBULENCE QUANTITIES

What properties of turbulence would we like to know? Turbulent kinetic energy E, the
rate ε at which it is being dissipated, and the associated vertical fluxes of mass and
momentum are some that spring to mind. The DOT system, augmented by sporadic
vertical  profiles  of  density,  should  offer  information  in  nearly  all  of  these  areas. 

Turbulent kinetic energy:

The definition of turbulent kinetic energy per unit mass as $E = \tfrac{1}{2}\,(\overline{u^2} + \overline{v^2} + \overline{w^2})$ uses the
components (u, v, w) of the turbulent velocity u, itself defined as the (zero-mean) part left
after removal of a "mean" velocity U = (U, V, W = 0) (where U and V are normally
assumed to be functions of z only) from the total velocity u_T. Inherent in this so-called
Reynolds decomposition of the flow is an appropriate definition of the averaging process
that defines the "mean" flow. While the assumption that W = 0 seems safe, it is difficult to
decide how to form a "mean" horizontal component in situations where the flow is
substantially  inhomogeneous.  The  problem  is  illustrated  in  the  record  of  Figure  6  which 
shows  (a)  the  horizontal  velocity  component  v  (relative  to  the  ship)  along  the  axis  of  a 
tidal  channel  and  (b)  vb,  the  baroclinic  part  of  this  field,  formed  by  removing  the  local 
depth-average  of  v.  At  the  beginning  (left)  of  this  record,  vb  has  a  three-layer  structure, 
with  surface  and  bottom  layers  moving  more  rapidly  than  a  mid-depth  layer.  By  the  end 
(right)  of  this  section  of  record,  the  structure  had  changed  to  bottom-intensified  two-layer 
flow.  It  is  not  at  all  clear  what  horizontal  scale  should  be  chosen  for  calculating  a  "mean" 
horizontal  velocity  component  V,  nor  how  that  scale  should  change  with  time  (horizontal 
distance). 
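
The baroclinic field vb used in Figure 6 is simply each profile of v with its own depth average removed. A minimal sketch, assuming one profile is held as a plain list of valid water-column bins (the function name and the sample values are hypothetical):

```python
def baroclinic(v_profile):
    """Return vb = v - <v>, where <v> is the depth average of this profile."""
    depth_mean = sum(v_profile) / len(v_profile)
    return [v - depth_mean for v in v_profile]


# Hypothetical profile (m/s), surface to bottom, with faster surface and
# bottom layers and a slower mid-depth layer.
profile = [0.9, 0.8, 0.4, 0.3, 0.4, 0.8, 0.9]
vb = baroclinic(profile)
```

Because the average is taken profile by profile, this removes ship speed and barotropic flow but says nothing about the horizontal averaging scale needed to define a "mean" V, which is the difficulty raised in the text.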

Because of this uncertainty as to the appropriate averaging for the horizontal "mean"
components, the cleanest definition of E would seem to be $E_i = \tfrac{3}{2}\,\overline{w^2}$, where the
overbar denotes an averaging length such that $\overline{w} = 0$, and the subscript is a reminder that
this is an isotropic estimate, obtained from the vertical velocity component only.
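
As an illustration of the isotropic estimate, the sketch below computes 3/2 times the mean square of w over a record. It assumes the record is a plain list of vertical velocities at one depth bin and removes the record mean so that the averaged w is zero; the function name and sample values are hypothetical, and no correction for measurement noise variance is attempted.

```python
def isotropic_tke(w):
    """Isotropic turbulent kinetic energy estimate, 3/2 * mean(w'^2).

    w' is the deviation of w from its mean over the averaging length.
    """
    n = len(w)
    w_mean = sum(w) / n
    mean_square = sum((x - w_mean) ** 2 for x in w) / n
    return 1.5 * mean_square


# Hypothetical vertical velocities in m/s.
e_i = isotropic_tke([0.1, -0.1, 0.2, -0.2])
```

With velocities in m/s the result is in m²/s² (energy per unit mass).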

Turbulent kinetic energy dissipation rate ε:

Also  of  interest  is  the  rate  at  which  mean  flow  energy  is  being  removed  to  dissipation 
scales  by  the  action  of  the  turbulence.  The  possibility  of  remote  measurement  of  this 
quantity  has  its  roots  in  the  work  of  Batchelor  and  Townsend  (1948),  who  showed  that 
the  large  scale  eddies  of  turbulence  lose  their  energy  to  the  turbulent  energy  cascade 
(Kolmogoroff,  1941)  within  at  most  a  few  eddy  turnover  times.  Since  energy  that  enters 
the cascade is delivered to dissipation scales, this means that

$$\varepsilon \sim \frac{(\mathrm{rw})^2}{\tau} \sim \frac{(\mathrm{rw})^3}{\ell} \qquad (1)$$

Figure 6. (a) Field of horizontal velocity v relative to the ship (determined from the fore-aft slant beam pair B1, B2) as the ship moves along the axis of a tidal channel. Variations in ship speed and/or the barotropic field are removed in (b), the baroclinic field vb = v − <v>, where <v> is the (local) depth-averaged value. The strongly inhomogeneous nature of the horizontal flow makes calculation of horizontal turbulent velocity components difficult.

where τ ~ ℓ/rw is the turnover time of an eddy of scale ℓ and rms turbulent vertical
velocity rw. This is only a scale relationship, leaving an unknown constant to be
determined. Direct measurements of ε, rw, and ℓ from the atmospheric boundary layer
have confirmed the relation (1) above, and suggest that the constant involved is between 3
and 5 (Wamser and Muller, 1977).


Thus for both E_i and ε estimates, it is necessary to derive values for rw, an rms velocity
typical of the energy-containing eddies of the turbulent field; for ε, we need in addition a
value for the characteristic length scale of such eddies. Meteorologists identify the
turbulent length scale ℓ as the location of the peak of a spectrum of vertical velocity as a
function of horizontal wavenumber, and the turbulent velocity scale rw as the square root of
the spectral integral, a procedure that makes sense in view of the long and homogeneous
records that can be obtained from meteorological towers. Unfortunately, the marked
inhomogeneity of the turbulent fields in coastal waters means that “a” wavelength doesn't
remain constant over the large number of wavelengths necessary for its determination by
such a Fourier technique. Wavelet analysis (Farge, this volume) may offer a more
sophisticated means of determining local wavelength and energy values, but for now, I
have used a very simple algorithm, shown schematically in Figure 7. The curve is that of
vertical velocity w, measured at constant depth (bin), as a function of horizontal
distance x; horizontal dashed lines denote ±σ, one standard deviation of the measurement
noise level about the zero mean. Starting with a point (say that marked by the open circle)
where |w| > σ, the algorithm searches for locations of the nearest preceding and following
points with |w| > σ but of the opposite sign (respectively P and F in Fig. 7). The
distance L between these points is taken as a local estimate of a half-wavelength. The
average of w over L, denoted aw, is similarly considered to be the average of w over a
half-wavelength. One then moves to point F and repeats the process, resulting in new
estimates L′ and aw′. These local estimates are assigned to the region over which they are
calculated; in the (usually small) regions of overlap, the first (in space/time) estimates are
arbitrarily chosen. Figure 8b shows the field of aw that results when this algorithm is
applied to the tidal front data of Figure 8a.

Figure 7. Schematic of a simple algorithm for determining local values of large-eddy turbulence parameters (half-wavelength average vertical velocity and length scale) needed for remote estimate of ε; for details, see text.
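
The search for P and F can be sketched in a few lines. This is one possible reading of the description above, not the author's code: it assumes equally spaced samples at a single depth bin, averages w over the samples strictly between P and F, and skips the first lobe if it has no preceding opposite-sign point. The assignment of estimates back to regions of the record (including the overlap rule) is omitted, and all names are hypothetical.

```python
import math


def half_wavelength_estimates(w, dx, sigma):
    """Return (p, f, L, aw) tuples for successive half-wavelengths of w.

    w     : vertical velocities at one depth bin, equally spaced in x
    dx    : horizontal spacing between samples
    sigma : one standard deviation of the measurement noise
    """
    n = len(w)

    def is_opposite(i, j):
        # Sample j exceeds the noise level and has the opposite sign to i.
        return abs(w[j]) > sigma and w[i] * w[j] < 0

    # Start at the first sample that exceeds the noise level.
    i = next((k for k in range(n) if abs(w[k]) > sigma), None)
    estimates = []
    while i is not None:
        p = next((j for j in range(i - 1, -1, -1) if is_opposite(i, j)), None)
        f = next((j for j in range(i + 1, n) if is_opposite(i, j)), None)
        if f is None:
            break  # the record ends before this lobe is closed
        if p is not None:
            lobe = w[p + 1:f]           # samples strictly between P and F
            L = (f - p) * dx            # local half-wavelength estimate
            aw = sum(lobe) / len(lobe)  # half-wavelength average of w
            estimates.append((p, f, L, aw))
        i = f  # move to F and repeat
    return estimates


# Synthetic check: a pure sinusoid with a wavelength of 40 samples.
w = [math.sin(2.0 * math.pi * k / 40.0) for k in range(120)]
estimates = half_wavelength_estimates(w, dx=1.0, sigma=0.05)
```

For this sinusoid each L comes out slightly longer than the true half-wavelength of 20 samples, because P and F lie just beyond the zero crossings, and |aw| falls slightly below the ideal half-wave mean of 2/π for the same reason.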

Assuming that the other half-wavelength exists (although not necessarily in the plane of
measurements), the values of aw are converted to a corresponding root-mean-square value
(rw) by the scaling factor (1.11) appropriate for a pure sinusoid, then used with the length
scale estimate ℓ = 2L (not shown) to form the estimate of ε, $\varepsilon_2 = (1.11\,\mathrm{aw})^3 / 2L$, which
is shown in logarithmic form in Figure 8c. Note that this estimate of rw can also be used in
the estimate $E_i = \tfrac{3}{2}\,\overline{w^2} = \tfrac{3}{2}\,(\mathrm{rw})^2$ of turbulent kinetic energy.
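
The conversion from half-wavelength averages to ε2 and E_i is short enough to write out. In this sketch the factor 1.11 is computed as π/(2√2), the ratio of rms to mean absolute value for a pure sinusoid; the function names and the sample values (aw = 0.05 m/s, L = 10 m) are hypothetical, and no scaling constant from relation (1) is applied.

```python
import math

# rms / mean-absolute-value for a pure sinusoid: (1/sqrt 2) / (2/pi), about 1.11
SINUSOID_RMS_OVER_MEAN = math.pi / (2.0 * math.sqrt(2.0))


def rms_from_half_wave_mean(aw):
    """Convert a half-wavelength average aw to an rms velocity rw."""
    return SINUSOID_RMS_OVER_MEAN * abs(aw)


def epsilon2(aw, L):
    """Dissipation estimate (1.11 aw)^3 / (2 L), with length scale 2 L."""
    return rms_from_half_wave_mean(aw) ** 3 / (2.0 * L)


def tke_isotropic(aw):
    """Isotropic kinetic energy estimate 3/2 * rw^2."""
    return 1.5 * rms_from_half_wave_mean(aw) ** 2


eps2 = epsilon2(0.05, 10.0)  # m^2/s^3 when aw is in m/s and L in m
e_i = tke_isotropic(0.05)    # m^2/s^2
```

Because ε2 depends on the cube of aw, modest errors in the velocity estimate are amplified roughly threefold in relative terms, which is one reason the comparison with direct measurements is made on a logarithmic scale.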

How much one may trust such an estimate of ε can be determined by comparing it with
values determined directly, by integration of the spectrum of small-scale shear measured in
situ. Vertical profiles of such direct measurements of ε were taken at the two locations

Figure 8. (a) Measured field of w in a tidal front. (b) Associated field of aw derived using the algorithm depicted in Figure 7; aw and L (not shown) can be used to form a field of ε2, estimated turbulent kinetic energy dissipation rate, shown as log(ε2) in the grey-scale presentation of (c). The vertical lines in (c) denote the launch times of a turbulence microprofiler (operated by Dr. J. Moum, Oregon State University) making direct measurements of ε; maximum profile depths are marked by arrows.

marked in Figure 8c by Jim Moum of Oregon State University. Figure 9(a,b) compares
the direct profiler measurement (log ε, light line) with the indirect estimate log(ε2) for
each profile. The heavy line is the logarithm of the average value of ε2 over ±10 pings
surrounding the launch of the profiler; the points give some idea of the spread of
individual estimates within these 21 pings. The agreement between the two estimates is
remarkably good for profile 65. In the subsequent profile, which went somewhat closer to
the bottom (about 300 m at both profile locations), we see a defect which tends to recur in
many such comparisons, namely a tendency for ε2 to underestimate ε near both the
surface and bottom boundaries of the flow. This may indicate the need to modify the
definition of turbulent length scale ℓ. Hunt, Stretch and Britter (1988) suggest an alternate
form, which tends toward the type of internal scale determined here when the flow is far
from boundaries but toward the distance z to the nearest boundary when z is less than this
inner scale. Indeed, in measurements taken in the ocean surface layer, Agrawal and Hwang
(1991) demonstrate good correspondence between directly measured ε and (rw)³/ℓ, with
ℓ = z. Such a modification to ℓ, causing length scales to decrease, hence ε2 to increase
near boundaries, would act to correct the discrepancies seen in Figure 9b.

As  shown  in  Figure  9c,  however,  there  are  profiles  in  which  there  remain  very  large  and 
unsystematic  differences  between  direct  measurements  and  indirect  estimates.  Indeed, 
given  the  high  turbulent  intensities  and  spatial/temporal  inhomogeneities  characteristic  of 
these  flows,  this  seems  scarcely  surprising.  Consider  that  the  profiler  is  launched  from  the 
stern of the ship, at which time and location the w field is assumed "known" from the
Doppler. Thereafter the profiler, falling vertically, can be advected horizontally by the
local  ambient  flow,  so  does  not  necessarily  remain  at  this  geographic  launch  position.  Even 
if  it  were  to  remain  there,  the  flow  field  may  change  in  the  time  taken  for  the  profile 
(typically  4-5  minutes  for  a  profile  to  300  m).  Various  checks  for  the  likelihood  of  time 
change  can  be  devised,  using  the  fact  that  the  fore/aft  slant  beams  allow  two 
measurements  of  v  that  are  separated  in  time,  but  this  is  merely  an  effort  to  avoid  a 
statistical  problem,  that  of  estimating  the  degree  of  agreement  (or  disagreement)  necessary 
before  a  remote  measurement  of  a  non-stationary  and  inhomogeneous  field  can  be 
considered  "proven"  by  a  relatively  sparse  set  of  ground-truth  measurements. 

Vertical buoyancy (mass) flux:

Part of the reason one might like a remote technique for ε is because for the last decade,
oceanographers have obtained what they often really want, the vertical buoyancy flux $\overline{\rho' w}$,
from what they are able to get from microstructure profiler measurements, namely ε, and a
model (Osborn, 1980) which suggests that under certain assumptions,

$$\overline{\rho' w} = \frac{R_f}{1 - R_f}\,\rho_0\, g^{-1}\,\varepsilon \qquad (2)$$

Figure 9. Comparison of log(ε) from direct profiler measurements (dashed lines) with the indirect estimates log(ε2) (solid lines) derived from the Doppler w field; horizontal axes are log ε (cm²/s³). The Doppler estimates are averaged over ±20 pings about the launch position of the profiler (see Figure 8); the individual points give some idea of the variation in the estimates averaged. Profile 65 (a) shows remarkably good agreement, while Profile 66 (b) shows differences near top and bottom boundaries which suggest that the definition of ℓ may need modification in these regions. There remain profiles (c) in which agreement is low.


where R_f, the flux Richardson number, is the ratio of buoyancy sink to shear source terms
in the turbulent kinetic energy equation. Oceanographers add the further assumption that
R_f/(1 − R_f) = 0.2, resulting in an estimate of buoyancy flux as a constant fraction of the measured
turbulent kinetic energy dissipation rate ε. If correct, this model means that a remote
measurement of ε would correspond to a remote measurement of buoyancy flux.
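
For orientation, the dissipation-based estimate amounts to a one-line calculation. The sketch below assumes the model takes the form ρ'w = [R_f/(1 − R_f)] ρ0 ε / g, with the ratio set to 0.2 by default; the reference density of 1025 kg/m³, g = 9.81 m/s², the function name, and the sample ε are illustrative assumptions rather than values from the paper.

```python
def osborn_density_flux(eps, rho0=1025.0, g=9.81, rf=None, gamma=0.2):
    """Dissipation-based estimate of the vertical density flux rho'w.

    eps   : kinetic energy dissipation rate (m^2/s^3)
    rho0  : reference density (kg/m^3), assumed value
    g     : gravitational acceleration (m/s^2)
    rf    : flux Richardson number; if given, gamma = rf / (1 - rf)
    gamma : ratio Rf / (1 - Rf), 0.2 by default
    """
    if rf is not None:
        gamma = rf / (1.0 - rf)
    return gamma * rho0 * eps / g


flux_default = osborn_density_flux(1e-6)        # kg m^-2 s^-1
flux_low_rf = osborn_density_flux(1e-6, rf=0.05)
```

The second call shows how strongly the estimate depends on the assumed mixing efficiency: a flux Richardson number of 0.05 gives a flux roughly a quarter of the default.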

However,  the  model  has  rarely  been  checked  by  comparison  with  direct  flux 
measurements,  as  these  are  extremely  difficult  to  make  in  the  ocean  environment.  The 
small amount of evidence which does exist (Yamazaki and Osborn, 1993) suggests that the assumed ratio
is either not constant, or else considerably smaller than 0.2. Vertical turbulent fluxes (or
equivalently, turbulent diffusivities) are important products of oceanic microstructure
measurements; it would be nice to know the circumstances (if any) under which such
dissipation-based  estimates  are  accurate,  hence  remote  measurement  of  buoyancy  flux 
would  be  possible. 


With the addition of a towed CTD with finescale resolution (Ocean Sensors), it has
proven  possible  to  make  statistically  significant  measurements  of  buoyancy  flux  using  the 
DOT  system.  The  CTD  is  towed  at  constant  depth,  just  in  front  of  the  vertical  beam  of  the 
Doppler,  for  long  periods.  Figure  10  shows  the  CTD  measurement  of  density,  and  the 
associated  time  series  of  w  measured  in  the  Doppler  bin  that  includes  the  CTD  tow  depth, 
over  about  three  hours.  Below  is  an  enlargement  of  a  small  section  of  the  record  (taking 
care  to  preserve  phase,  the  density  field  has  been  high-pass  filtered  to  remove  the  very 
largest scales of variation in water properties). Buoyancy flux will be a positive quantity if
on average downward (upward) vertical velocities carry lighter (heavier) water.

Figure 10. The top panel shows time series of CTD density (light line), along with w (dark line) from the Doppler bin within which the CTD was towed. Before calculating fluxes, the density time series is high-pass filtered (preserving phase) to remove the variance associated with large-scale water mass change; the enlargement shows filtered density and w over one of the interval lengths used in the flux calculation.


Figure 11 shows the direct flux estimates, formed by breaking the ρ' and w records into pieces
of fixed length = spts, then forming $\overline{(\rho' - \overline{\rho'})(w - \overline{w})}$, where the average is over spts. Error
bars are calculated from the variance of such estimates over the number (spts) of different
starting points, and an estimate of the number of independent values determined from the
number of zero-crossings of w. The points in Figure 11 are the accompanying estimates of
the buoyancy flux made using (2) above with R_f/(1 − R_f) = 0.2 (ε values were taken from
the Oregon State profiler measurements over a range of 6 m centered on the CTD tow
depth; courtesy of Jim Moum). While there is encouraging general agreement, i.e., values
tend to be high where the direct flux measurement is large and positive, low when the
direct measurement is not statistically different from zero, we face (again) the problem of
how best to average "point" estimates from the profiler for comparison with a more
broadly based determination from the towed measurement.
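
The block-averaged flux calculation can be sketched as follows. This is a minimal illustration, assuming the filtered density and w series are equal-length lists; it computes only the block covariances, and the error bars based on shifted starting points and zero-crossing counts are not reproduced. The function name, the `offset` parameter, and the synthetic series are hypothetical.

```python
import math


def block_fluxes(rho, w, spts, offset=0):
    """Mean of (rho - mean(rho)) * (w - mean(w)) over consecutive blocks.

    rho, w : equal-length time series (density already high-pass filtered)
    spts   : number of points per block
    offset : index of the first block's starting point
    """
    fluxes = []
    for start in range(offset, len(w) - spts + 1, spts):
        r = rho[start:start + spts]
        v = w[start:start + spts]
        r_mean = sum(r) / spts
        v_mean = sum(v) / spts
        cov = sum((a - r_mean) * (b - v_mean) for a, b in zip(r, v)) / spts
        fluxes.append(cov)
    return fluxes


# Synthetic series: density and w fluctuating in phase, then in quadrature.
t = [2.0 * math.pi * k / 40.0 for k in range(120)]
w = [0.1 * math.sin(x) for x in t]
rho_in_phase = [0.01 * math.sin(x) for x in t]
rho_quadrature = [0.01 * math.cos(x) for x in t]

in_phase = block_fluxes(rho_in_phase, w, spts=40)
quadrature = block_fluxes(rho_quadrature, w, spts=40)
```

The in-phase series gives a non-zero covariance in every block, while the quadrature series (the wave-like case) gives essentially zero, which is why preserving phase when filtering the density record matters. Repeating the call with `offset` stepped from 0 to spts − 1 would give the family of estimates from which a variance could be formed.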


Figure 11. Direct calculation of buoyancy flux (solid line) with estimated error bars (dashed lines) over
consecutive 400-point blocks of the time series shown in Figure 10. Circles are indirect estimates
of the flux, using profiler “point” measurements of ε and the formula


CONCLUSIONS 

It  is  now  possible  to  make  measurements  of  the  vertical  velocity  field  in  turbulent  coastal 
flows,  using  a  modified  ADCP  system.  This  allows  us  to  site-survey  for  turbulence  and, 
once  found,  to  investigate  its  spatial  and  temporal  variability.  From  the  w  field 
measurement,  it  will  be  possible  to  estimate  turbulent  kinetic  energy  E  and  possibly  its 
dissipation rate ε. Addition of a towed CTD allows direct measurement of buoyancy flux:
if the model (2) relating buoyancy flux to ε can be validated, remote measurement of ε
would  be  equivalent  to  remote  measurement  of  buoyancy  flux,  probably  the  feature  of 
turbulent  flows  that  is  of  the  greatest  importance  to  coastal  applications. 

Are  results  from  the  coastal  ocean  likely  to  be  valid  when  translated  to  offshore  oceans? 
From  the  data  presented  here,  velocities  characteristic  of  turbulence  in  the  coastal  ocean 
are  clearly  much  higher  than  those  we  expect  offshore.  However,  coastal  stratification  is 
also much larger: the combination makes the coastal ocean less different from that
offshore  than  one  might  think.  The  lower  offshore  signal  level  poses  some  challenges,  but 
if  a  stable  platform  can  be  provided,  the  increased  accuracy  available  with  the  newer 
coded-pulse  sonars  should  allow  this  type  of  measurement  to  be  made  offshore  as  well; 
one  foresees  applications  in  studies  of  surface  and  bottom  boundary  layers  in  particular.  It 
is  my  hope  that  the  techniques  discussed  here  will  eventually  prove  as  useful  in  the 
offshore environment as they are in the coastal ocean.

FINESCALE SHEAR AND STRAIN IN THE THERMOCLINE

Early studies of the temperature, density, and velocity fields in the sea were performed
from a “hydrographic” perspective. The expectation was that one could “chart the
oceans” structurally. The charts, once drawn, would remain valid. The tools of
hydrography were the reversing thermometer and the Nansen bottle. These yielded a
picture of the ocean interior on vertical scales of hundreds of meters, horizontal scales of
tens of kilometers. From very early on it was appreciated that smaller scale phenomena
were active in the ocean interior. Yet it was difficult to infer the role these small scale
motions played in maintaining the hydrographic fields.

With contemporary sensors far clearer pictures of the small-scale oceanic fields are
emerging. Yet the difficulty in quantifying the interaction with the hydrographic-scale
ocean remains. In this work we concentrate on motions of vertical scale 3-50 m. Over
this range, the scalar fields transition from highly skewed to nearly Gaussian behavior.

The objective of this work is to quantify this transition in a statistical sense, with a
particular focus on strain, shear and Richardson number, $Ri = N^2/(\partial u/\partial z)^2$. Here,
$N^2 = (g/\rho)\,\partial\rho/\partial z$ is the Vaisala frequency squared, where $\rho$ is the potential density of the
sea water.

Strain  statistics  were  investigated  in  a  previous  work  (Pinkel  and  Anderson  1992, 
henceforth PA 92) and are reviewed here in section 1. This previous study emphasized
the  utility  of  describing  the  finescale  fields  from  the  perspective  of  “reversible  fine 
structure,”  a  term  introduced  by  Desaubies  and  Gregg  (1981).  They  argued  that  the 
extremely  intense  finescale  variability  of  passive  scalars  in  the  thermocline  results  from  the 
simple  straining  of  a  smoother  underlying  field  by  the  energetic  internal  wavefield. 
Irreversible  processes  such  as  turbulent  mixing  (Cox  et  al.  1969)  and  thermohaline 
intrusions  (Stommel  and  Federov,  1967)  typically  play  a  secondary  role.  If  one  adopts  the 
reversible  finestructure  hypothesis,  it  becomes  attractive  to  describe  variations  in  a 
coordinate  system  that  is  unaffected  by  the  finescale  straining.  Using  a  repeatedly  profiled 
CTD,  we  track  the  vertical  motion  of  a  set  of  isopycnal  surfaces.  The  time  evolution  of 
both  scalar  and  vector  fields  can  be  described  in  this  isopycnal  following  frame  (henceforth 
referred  to  as  a  semi-Lagrangian  frame),  as  well  as  in  a  conventional  Eulerian  frame. 

*Now  at  Woods  Hole  Oceanographic  Institution 
Woods  Hole,  MA  02543. 


PINKEL  AND  ANDERSON 


In section 2, shear and Richardson number statistics are presented. Shear data are
obtained  from  a  161  kHz  coded-pulse  Doppler  sonar  mounted  on  the  Research  Platform 
FLIP.  Sonar  resolution  is  sufficient  that  the  vertical  advection  of  the  shear  field  by  the 
internal  wavefield  can  be  seen.  This  observation  encourages  the  use  of  semi-Lagrangian 
coordinates  to  describe  time  evolution  of  the  shear.  The  modeling  of  Richardson  number 
takes  on  a  different  form  in  the  semi-Lagrangian  frame  than  in  previous  Eulerian  studies, 
such  as  Desaubies  and  Smith  (1982)  or  Munk  (1981).  In  section  2  a  simple  model  is 
derived  and  compared  with  the  data.  Agreement  between  model  and  data  is  encouraging. 
A  brief  discussion  of  results  and  implications  concludes  this  work. 

1. FINESCALE STRAIN IN THE THERMOCLINE

Strain  Measurement 

The  data  considered  for  the  strain  study  are  a  set  of  9000  CTD  profiles,  from  the  surface 
to  560  m.  These  were  obtained  during  October  1986  from  the  Research  Platform  FLIP, 
when  it  was  located  at  34°N,  127°W,  approximately  500  km  west  of  Point  Conception, 
California.  Position  was  maintained  to  within  300  m  by  a  two-point  moor.  Water  depth  at 
the  site  is  4  km. 

The  CTDs  used  are  Seabird  Instruments  model  SBE-9s.  Two  such  instruments  are 
profiled.  The  upper  unit  is  cycled  from  the  surface  to  320  m.  The  lower  system  covers 
the depth range 250-560 m. Profiles are repeated at 3-min intervals. The drop rate of the
sensors is approximately 3.5 m s$^{-1}$. It is not necessary to pump water through the
conductivity cell to achieve adequate spatial resolution at this drop rate. Following
response  corrections  to  the  temperature  and  conductivity  sensors  (PA  92),  density  profiles 
are  produced.  A  set  of  560  isopycnals,  of  mean  separation  1  m,  is  followed  for  the 
duration  of  the  experiment. 

The  3-hour  record  presented  in  Figure  1  represents  a  small  portion  of  the  18.75-day  data 
set.  In  it  one  sees  a  general  trend  toward  decreasing  isopycnal  depth,  associated  with  the 
baroclinic  tide.  Superimposed  on  this  trend  are  higher-frequency  (1-2  cph)  internal  waves. 
These  are  extremely  coherent  with  depth.  Against  this  large-scale  background,  the 
finescale  straining  of  the  density  field  is  seen.  Isopycnals  converge  to  form  “sheets”  of 
high  vertical  gradient  and  diverge,  forming  low-gradient  “layers.”  The  typical  time  scale 
for  the  finescale  variation  appears  to  be  from  one-half  to  several  hours,  in  this  short 
record.

Protagonists in the present study are

isopycnal displacement $\eta(t) = z(\rho, t) - \bar{z}(\rho)$,
isopycnal separation $\Delta z_{ij}(t) = z(\rho_i, t) - z(\rho_j, t)$,
the normalized separation $\gamma_{ij}(t) = \Delta z_{ij}(t) / \overline{\Delta z_{ij}}$,

and the finite-difference strain $\gamma_{ij}(t) - 1$.


Figure 1. An example of isopycnal depth fluctuations as seen in the PATCHEX dataset.
The statistics of isopycnal separation are the focus of the present study.
[Time axis: 05:00-10:00 Pacific Standard Time, day 299, 1986.]

The  Probability  Density  Functions  of  Strain 


From the depth-time history of isopycnal displacement, strain statistics can be estimated in
two distinct ways. One can simply calculate the probability density functions (pdfs) of
separation between selected isopycnal pairs. This is the isopycnal-following or
“semi-Lagrangian” approach. One can also monitor the separation statistics of that pair of
isopycnals that is bracketing a fixed reference depth. This provides an Eulerian view of
the strain field. Both Eulerian and semi-Lagrangian pdfs have been calculated from the
PATCHEX data set. To investigate possible depth variability of the strain field, separate pdfs
are formed for discrete 100-m depth regions: 100-200 through 400-500 m. Density
functions for the 200-300 m region are presented in Figures 2 and 3, for mean isopycnal
separations of 1-10 m. Each pdf is formed from $9 \times 10^5$ data, sorted into 100 bins. The
data are not, however, mutually independent. Careful analysis (PA 92) suggests that there
are between 50 and 90 independent estimates per bin in a typical histogram. Sample pdfs
have been formed for mean separations as great as 50 m. While these appear nearly
Gaussian at scales greater than 10 m, skewness and kurtosis estimates are significant to
separations of order 30 m (Fig. 4).

Figure 2. Probability density functions of normalized separation, $\gamma$, formed in a
semi-Lagrangian frame, for mean isopycnal separations 1-10 m. Dotted lines give model Gamma
pdfs, constrained to have unity mean and the observed variance. Data from 200-300 m depth are
presented.

Figure 3. Probability density functions of normalized separation, $\gamma$, as in Figure 2 except
formed in an Eulerian frame. Dotted lines give model Gamma pdfs, constrained to have mean
and variance identical to the observations. Data from 200-300 m depth are presented.
[Panel title: Eulerian, 200-300 m.]

The observed pdfs have been fit to a variety of classical forms, including Rayleigh,
Weibull, Lognormal and Gamma. Significant discrepancies are subjectively apparent in all
comparisons, with the notable exception of the Gamma pdf, which fits very well (Figs.
2, 3). The Gamma pdf has the form

$$P(x) = \frac{\beta^{\alpha}}{\Gamma(\alpha)}\, x^{\alpha-1} e^{-\beta x}, \tag{1}$$

with mean $\langle x\rangle = \alpha/\beta$ and variance $\alpha/\beta^2$ (Papoulis 1984).

The semi-Lagrangian data are constrained to have $\langle\gamma\rangle = 1$, $\langle\Delta z\rangle = \overline{\Delta z}$, by initial choice of
isopycnals. Hence, $\alpha = \beta\,\overline{\Delta z}$. The fits presented in Figure 2 are thus one-parameter fits,
with sample variance matched to the model variance. The Eulerian observations are not
constrained to unity mean. These require two-parameter fits. The observed mean and
variance are used to set model pdf parameters in Figure 3.
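
The moment matching described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the fitting code behind Figures 2 and 3: the function names and the synthetic samples are hypothetical, and the Gamma pdf is taken with shape $\alpha$ and rate $\beta$, so that the mean is $\alpha/\beta$ and the variance is $\alpha/\beta^2$.

```python
import math
import random


def gamma_pdf(x, alpha, beta):
    """Gamma pdf with shape alpha and rate beta (mean alpha/beta, variance alpha/beta**2)."""
    if x <= 0.0:
        return 0.0
    log_p = (alpha * math.log(beta) + (alpha - 1.0) * math.log(x)
             - beta * x - math.lgamma(alpha))
    return math.exp(log_p)


def fit_gamma_by_moments(samples, unit_mean=False):
    """Match sample moments to the Gamma pdf.

    unit_mean=True  -> one-parameter fit (mean forced to 1, alpha = beta = 1/variance),
                       as for normalized semi-Lagrangian separations.
    unit_mean=False -> two-parameter fit from the sample mean and variance,
                       as for Eulerian separations.
    """
    n = len(samples)
    mean = sum(samples) / n
    var = sum((s - mean) ** 2 for s in samples) / (n - 1)
    if unit_mean:
        alpha = beta = 1.0 / var
    else:
        beta = mean / var
        alpha = mean * beta
    return alpha, beta


# Synthetic test data (hypothetical values, chosen only to exercise the fit).
random.seed(0)
unit_mean_samples = [random.gammavariate(4.4, 1.0 / 4.4) for _ in range(50000)]
alpha1, beta1 = fit_gamma_by_moments(unit_mean_samples, unit_mean=True)

free_mean_samples = [random.gammavariate(3.0, 0.5) for _ in range(50000)]
alpha2, beta2 = fit_gamma_by_moments(free_mean_samples)

print("one-parameter fit:", alpha1, beta1)
print("two-parameter fit:", alpha2, beta2)
```

A least-squares fit to the histogram would generally give slightly different parameters; moment matching is used in this sketch only because it is the procedure the text describes.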

The Gamma pdf is seen to fit the observations well in the 200-300 m depth range, except
at separations less than 4 m. The fits are comparable in the other depth ranges, with the
exception of the 300-400 m interval, where the Lagrangian pdfs appear distorted at
small $\gamma$, over a range of $\overline{\Delta z}$ = 3-7 m (PA 92). The fits could be improved by employing a
least-squares fitting procedure. Optimizing the fit, however, is not the point of the present
exercise.

Figure 4. (a) Skewness and (b) kurtosis as a function of bin size for the Eulerian
probability functions (Fig. 3). (c) Skewness and (d) kurtosis as a function of mean
separation for the probability density functions of isopycnal separation (Fig. 2).


A Statistical Model of Finescale Strain

Gamma  pdfs  are  associated  with  the  classical  theory  of  Poisson  processes.  They  describe 
the  statistics  of  separation  between  the  occurrence  of  Poisson  “events”  (Papoulis  1984). 

If,  indeed,  simple  Poisson  statistics  describe  the  non-Gaussian  behavior  of  the  finescale 
field,  the  problem  of  modeling  the  motion  field  in  this  regime  can  be  significantly 
advanced. 

Considering the thermocline as a one-dimensional statistical process, we envision a set of
“Poisson tracers,” whose vertical position is tracked from one realization of the process to
the next. Poisson statistics describe the occurrence of these tracers. The Poisson
probability function gives the probability of occurrence of $k$ tracers in a dimensional
interval of length $H$:

$$P(k; H) = \frac{(\kappa_0 H)^k}{k!}\, e^{-\kappa_0 H}. \tag{2}$$

The Poisson probability function has the interesting property that the mean number of
“events” occurring in an interval $H$ is equal to the variance of the number of events.

We define the normalized separation, $\gamma$, between two Poisson tracers to be the ratio of the
instantaneous separation of the tracers to the mean separation. The strain, $\gamma - 1$, is
assumed constant over the interval spanned by the tracers. Between adjacent tracer pairs,
values of strain are uncorrelated. Thus an individual realization of the strain profile is
discontinuous (Fig. 5b). However, the vertical profile of a passive scalar, $\theta$, being strained
in this Poisson field is continuous (Fig. 5a). The profile is composed of a series of
constant gradient segments whose statistics are easily derived. The exponential
distribution, $P(\Delta z) = \kappa_0 e^{-\kappa_0 \Delta z}$, governs the probability of separation between adjacent
Poisson tracers, as well as the distance from arbitrary fixed points to the adjacent
tracers (Papoulis 1984). In a semi-Lagrangian study, the statistics of a specific pair of
adjacent tracers are followed from one realization to the next. From the exponential
distribution, one finds

$$\langle \Delta z^{n} \rangle = n!\,\kappa_0^{-n}. \tag{3}$$

Thus, the strain variance as seen in a “tracer-following” frame is unity. While the tracer
separation scale, $\kappa_0^{-1}$, can be adjusted, the strain variance is fixed in this model.
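
The unit strain variance of the tracer-following frame is easy to confirm numerically. The sketch below is a hypothetical check, not part of the original analysis: it draws separations between adjacent Poisson tracers from an exponential distribution with rate $\kappa_0$ (the value 1.1 m$^{-1}$ is chosen only for illustration), normalizes by the mean separation, and estimates the variance of $\gamma - 1$.

```python
import random

random.seed(1)
kappa0 = 1.1          # tracers per meter (illustrative value)
n_samples = 200000

# Separation between adjacent Poisson tracers is exponentially distributed.
separations = [random.expovariate(kappa0) for _ in range(n_samples)]
mean_separation = sum(separations) / n_samples

# Normalized separation gamma and finite-difference strain gamma - 1.
strain = [s / mean_separation - 1.0 for s in separations]
strain_variance = sum(g * g for g in strain) / n_samples

print("mean separation (m):", mean_separation)   # close to 1/kappa0
print("strain variance:", strain_variance)       # close to 1 for any kappa0
```

Changing `kappa0` rescales the mean separation but leaves the estimated strain variance near unity, which is the sense in which the variance is fixed in this model.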

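In an Eulerian study one follows that pair, trio, or quartet of tracers that brackets the
arbitrary fixed reference depths $Z_a$ and $Z_b$. Different tracers may be involved from one
realization to the next. In the event that a single pair of tracers brackets the reference
depths, the separation between these tracers can be thought of as the sum of three terms:
the distance from $Z_a$ to the adjacent tracer outside the interval, the interval $H$ between
the reference depths, and the distance from $Z_b$ to the adjacent tracer outside the interval.
Using the exponential distribution, it is easily shown that

$$\langle \Delta z \rangle = 2\kappa_0^{-1} + H,$$

$$\langle \Delta z \rangle^2 = 4\kappa_0^{-2} + 4\kappa_0^{-1}H + H^2. \tag{4}$$

Each of the two end distances contributes a variance $\kappa_0^{-2}$, so the variance of $\Delta z$ is
$2\kappa_0^{-2}$ and the strain variance is $2/(2 + \kappa_0 H)^2$. In the limit of vanishing separation, $H$,
the Eulerian strain variance has value 0.5. This is again an inherent aspect of the Poisson
model, independent of the adjustable parameter $\kappa_0$.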
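
The Eulerian limit can also be checked by direct simulation. The sketch below is a hypothetical illustration: it builds explicit realizations of a one-dimensional Poisson process of tracers, keeps those realizations in which a single tracer pair brackets both reference depths, and compares the mean separation and the normalized variance with $2\kappa_0^{-1} + H$ and $2/(2 + \kappa_0 H)^2$. The reference depth of 10 m, the 20 m domain, and $\kappa_0 = 1.1$ m$^{-1}$ are arbitrary choices made only for the example.

```python
import bisect
import random


def bracketing_separation(kappa0, z_a, z_b, depth_max, rng):
    """One realization of the tracer field.

    Returns the separation of the single tracer pair that brackets both
    z_a and z_b, or None if a tracer falls between the reference depths
    (or if the pair is not contained in the simulated domain).
    """
    tracers = []
    z = rng.expovariate(kappa0)
    while z < depth_max:
        tracers.append(z)
        z += rng.expovariate(kappa0)
    i = bisect.bisect_right(tracers, z_a)   # first tracer deeper than z_a
    j = bisect.bisect_right(tracers, z_b)   # first tracer deeper than z_b
    if i != j or i == 0 or i == len(tracers):
        return None
    return tracers[i] - tracers[i - 1]


def eulerian_stats(kappa0, gap, n_realizations=20000, seed=0):
    """Mean separation and normalized (strain) variance for reference depths gap apart."""
    rng = random.Random(seed)
    z_a = 10.0
    seps = []
    for _ in range(n_realizations):
        s = bracketing_separation(kappa0, z_a, z_a + gap, 20.0, rng)
        if s is not None:
            seps.append(s)
    mean = sum(seps) / len(seps)
    var = sum((s - mean) ** 2 for s in seps) / len(seps)
    return mean, var / mean ** 2


kappa0 = 1.1
mean0, strain_var0 = eulerian_stats(kappa0, 0.0)   # vanishing separation H
mean1, strain_var1 = eulerian_stats(kappa0, 1.0)   # H = 1 m

print("H = 0:", mean0, strain_var0)
print("H = 1:", mean1, strain_var1)
```

The interval that happens to contain a fixed depth is on average twice as long as a typical interval, which is why the Eulerian mean is $2\kappa_0^{-1}$ rather than $\kappa_0^{-1}$ as $H$ vanishes.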

An Eulerian covariance function for strain can be derived.

Here $\Delta z_a$ gives the separation between those Poisson tracers that bracket depth $Z_a$,
while $\Delta z_b$ gives the separation between those tracers spanning depth $Z_b$. The brackets
imply averaging over separate realizations of the profile. Given the hypotheses of the
model, if one or more Poisson tracers occur between points $Z_a$ and $Z_b$ the strain will be
uncorrelated; $R_\gamma(Z_a, Z_b) = 0$. If the points $Z_a$ and $Z_b$ fall between successive Poisson
tracers, they will experience identical strain. For this case the covariance is given by
$R_\gamma(Z_a, Z_b) = P_0\,\sigma_\gamma^2$ (Papoulis 1984). Here $\sigma_\gamma^2$ is the strain variance of that tracer
pair that is bracketing $Z_a$ and $Z_b$, realization after realization. $P_0$ is the probability
that $Z_a$ and $Z_b$ are spanned by a single pair of tracers. This is identically the probability
that no Poisson tracers will be found in the interval, $H$, between the reference depths.
From (2), $P_0 = e^{-\kappa_0 H}$. Combining (2) and (3), one has

$$R_\gamma(H) = \frac{2}{(2 + \kappa_0 H)^2}\, e^{-\kappa_0 H}. \tag{6}$$

The corresponding vertical wavenumber spectrum of strain is given by

$$S(k) = 4\kappa_0^{-1}\, \mathrm{Re}\left[ e^{2(1 + 2\pi i k/\kappa_0)}\, E_2\big(2(1 + 2\pi i k/\kappa_0)\big) \right]. \tag{7}$$

Here $E_2$ is the exponential integral function (Abramowitz and Stegun 1970). The
normalization is appropriate for a one-sided spectrum with $k$ in cycles per meter. The
covariance and wavenumber spectrum of strain are presented in Figures 5c, d.
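
The spectrum can be evaluated without the exponential integral by taking the cosine transform of the covariance numerically. The sketch below assumes the covariance $R_\gamma(H) = 2e^{-\kappa_0 H}/(2 + \kappa_0 H)^2$ and the one-sided convention $S(k) = 4\int_0^\infty R_\gamma(H)\cos(2\pi k H)\,dH$ with $k$ in cycles per meter. The trapezoid rule, the truncation at $\kappa_0 H = 40$, and the function names are choices made for this illustration only.

```python
import math


def strain_covariance(lag, kappa0):
    """Eulerian strain covariance of the Poisson model at vertical lag H (m)."""
    x = kappa0 * lag
    return 2.0 * math.exp(-x) / (2.0 + x) ** 2


def strain_spectrum(k, kappa0, n_steps=20000, x_max=40.0):
    """One-sided spectrum S(k) = 4 * int_0^inf R(H) cos(2 pi k H) dH (trapezoid rule)."""
    h_max = x_max / kappa0
    dh = h_max / n_steps
    total = 0.0
    for i in range(n_steps + 1):
        lag = i * dh
        weight = 0.5 if i in (0, n_steps) else 1.0
        total += weight * strain_covariance(lag, kappa0) * math.cos(2.0 * math.pi * k * lag)
    return 4.0 * total * dh


kappa0 = 1.1  # m^-1, illustrative value
low_k_level = strain_spectrum(0.0, kappa0)

print("R(0) =", strain_covariance(0.0, kappa0))          # Eulerian strain variance
print("kappa0 * S(0) =", kappa0 * low_k_level)           # dimensionless low-wavenumber level
for k in (0.01, 0.1, 1.0):
    print("S(%.2f cpm) = %.4f m" % (k, strain_spectrum(k, kappa0)))
```

Multiplying the low-wavenumber level by $\kappa_0$ gives a pure number, so the white level of the spectrum is set entirely by the single parameter $\kappa_0$.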

This  Poisson  model  of  the  thermocline  is  powerful  by  virtue  of  its  primal  simplicity.  The 
single variable $\kappa_0$ describes all dimensional aspects of the model. The model successfully
links the strain correlation scale $\kappa_0^{-1}$, of order 1 m in the open-ocean pycnocline, with the
well-known  cutoff  of  strain  and  shear  that  occurs  near  10-m  scale  (Fig.  5d).  There  is  no 
need  to  invoke  a  critical  Richardson  number  criterion  here,  as  in  Munk  (1981).  The  model 
predicts  a  spectral  slope  slightly  steeper  than  the  classical  form,  at  scales  shorter  than 
10  m. 


The spectral level in the low wavenumber limit is $1.109\,\kappa_0^{-1}$, for a one-sided spectrum with
wavenumber in units of cycles per meter. In the various Garrett-Munk models of the
internal wavefield (e.g., Munk 1981), the strain spectral level is given by $S_{GM}(k) = \pi^2 E b j_*$
(Gregg and Kunze 1991), where $E = 6.3 \times 10^{-5}$ is the dimensionless internal wave energy
parameter, and $b = 1.3 \times 10^{3}$ m is the pycnocline scale depth. The G-M model best fits the
Patchex strain spectrum for values of the bandwidth parameter $j_*$ (Sherman 1989). For
wavenumbers $0.01 < k < 0.1$, the Poisson and G-M model spectral levels are comparable if
$\kappa_0 = 1.109/(\pi^2 E b j_*) = 1.37$ m$^{-1}$ for $j_* = 1$. The Poisson approach differs from the G-M model in
that the strain variance is fixed. The form of the wavenumber spectrum changes as the
spectral level is altered, such that the variance is preserved. Also, unlike the G-M
approach, the Poisson model relates variance to skewness, kurtosis, and higher-order
quantities as a function of vertical scale.

Figure 5. A model vertical profile of a passive scalar, $\theta$ (a). The profile consists of a series of constant
gradient regions. These correspond to the regions of constant strain (b), whose boundaries, $\{z_i\}$, vary from
realization to realization as a Poisson process. The strain, $(z_i - z_{i+1})/\overline{\Delta z}$, has a spatial autocovariance (c)
and vertical wavenumber spectrum (d), here evaluated for $\kappa_0 = 1.1$ m$^{-1}$. Note that the Poisson scale $\kappa_0^{-1}$, of
order 1 m, is associated with a cutoff in the spectrum at a scale roughly $2\pi$ times larger.
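
The matching of spectral levels between the Poisson and Garrett-Munk descriptions is simple arithmetic, sketched below. The sketch assumes the G-M strain spectral level is $\pi^2 E b j_*$ with $E = 6.3 \times 10^{-5}$ and $b = 1.3 \times 10^{3}$ m, and takes $j_* = 1$ purely as an illustrative choice; the variable names are hypothetical.

```python
import math

E = 6.3e-5          # dimensionless internal wave energy parameter
b = 1.3e3           # pycnocline scale depth, m
j_star = 1.0        # bandwidth parameter (illustrative choice)

# G-M strain spectral level, in meters (strain variance per cycle per meter).
gm_level = math.pi ** 2 * E * b * j_star

# Poisson low-wavenumber level is 1.109 / kappa0; equate the two levels.
kappa0_matched = 1.109 / gm_level

print("G-M strain spectral level (m):", gm_level)
print("matched kappa0 (1/m):", kappa0_matched)
```

Because the matched $\kappa_0$ varies inversely with $j_*$, a larger bandwidth parameter corresponds to a longer Poisson correlation scale $\kappa_0^{-1}$.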


2.  FINESCALE  SHEAR  AND  RICHARDSON  NUMBER 
Shear and Strain

The apparent success at modeling the strain field using a “reversible fine structure”
approach prompted a similar investigation of fine-scale shear. An initial data set was
collected during February and March 1990 in the surface waves processes experiment,
SWAPP. A 161-kHz Doppler sonar was mounted on the Research Platform FLIP and
operated in conjunction with the profiling CTD. During this period, FLIP was tri-moored
at 35°N, 127°W. Water depth at the site was approximately 4 km.

The sonar obtained quality estimates of water velocity over the depth range 30-300 m,
with 5.5 m vertical resolution. It operated continuously over a 19 day period. However,
during the central period of the cruise, March 6 to 9, a large front passed under FLIP,
significantly altering the qualitative nature of the velocity and shear fields. We restrict the
subsequent study to the period before frontal passage, to avoid the atypical regime.

The  profiling  CTDs  were  similar  to  those  used  in  the  Patchex  Experiment.  In  SWAPP, 
however,  the  profiling  rate  was  increased  to  once  per  130  s,  rather  than  the  previous 
180 s. The increased rate was selected to improve CTD-derived estimates of vertical
velocity and strain rate. Profiles were achieved from the surface to a depth of 420 m.

SWAPP represents an evolutionary departure from previous FLIP-based examinations of
the thermocline. Rather than using long range (~1.2 km) Doppler sonars of relatively low
(15 m) vertical resolution (e.g., Pinkel et al. 1987), here a shorter range system with
higher  resolution  is  used.  The  development  of  a  practical  scheme  for  coding  the  sonar 
transmissions  (Pinkel  and  Smith,  1992)  enables  the  improved  resolution  and  precision 
attained  in  SWAPP. 

For  the  first  time,  the  resolution  scale  of  the  sonar  approaches  the  vertical  displacement 
scale  of  the  internal  wavefield.  When  the  shear  field  is  closely  examined  the  distortion  due 
to  the  vertical  displacement  of  the  wavefield  is  clearly  seen.  In  Figure  6,  a  representative 
12-hour  segment  of  the  shear  field  is  presented.  Superimposed  on  the  plot  are  the  depths 
of  a  selected  set  of  isopycnal  surfaces.  These  illustrate  the  vertical  displacement  of  the 
wavefield.  Instances  where  the  low  frequency  shear  is  being  advected  by  high  frequency 
waves  are  seen  throughout  the  record. 

While  not  a  totally  unexpected  observation,  the  vertical  advection  of  the  low  frequency 
shear  by  high  frequency  internal  waves  represents  an  interesting  reversal  of  the  typical 
view  of  wavefield  kinematic  behavior.  It  is  more  common  to  think  of  long  wave-short 
wave  interactions  in  terms  of  the  short  (high  frequency)  waves  being  advected/refracted  by 
the  long  (low  frequency)  waves.  In  the  oceanic  thermocline,  the  “long”  near  inertial 
waves  can  have  shorter  vertical  wavelengths  than  the  “short”  (horizontal  wavelength)  high 
frequency  waves;  hence,  the  opportunity  to  observe  this  distortion. 

Modeling  Shear  and  Richardson  Number 

The apparent displacement of the low frequency shear field by high frequency internal
waves has both dynamic and kinematic consequences. Here, we focus on the purely
descriptive problem, the appropriate modeling of shear and Richardson number statistics.
Figure 6 suggests that the statistics will be quite different, depending on whether the data
are collected in an Eulerian or semi-Lagrangian frame.

The issue of the statistical independence of shear and strain is critical for understanding the
evolution of the Richardson number. Desaubies and Smith (1982), in a previous attempt
to model Ri, assumed the independence of these quantities. Munk (1981), in a separate
study, assumed that strain was effectively constant; only shear fluctuations affected the
Richardson number. Figure 6 suggests that the horizontal velocity and shear fields are
being simply advected by the vertical velocity. We can avoid the kinematic aspects of the
problem by shifting to an isopycnal-following frame. However, a first order dilemma
remains. In the semi-Lagrangian frame, is the shear,

$$\frac{\partial u}{\partial z} \equiv \frac{u(\rho_1,t) - u(\rho_2,t)}{z(\rho_1,t) - z(\rho_2,t)} = \frac{\Delta u}{\Delta z}, \tag{8}$$

truly independent of the strain, $\gamma_{12}$? If so, then the cross isopycnal velocity difference

$$\Delta u \equiv u(\rho_1,t) - u(\rho_2,t) = \frac{\partial u}{\partial z}\,\gamma_{12}\,\overline{\Delta z} \tag{9}$$

must be dependent on strain. The converse also holds. It is not possible that both $\Delta u$
and $\partial u/\partial z$ be independent of strain.

To address this fundamental issue, a separate study was performed. The correlations
between shear squared, velocity difference squared, and inverse strain (Vaisala frequency
squared) were estimated. Non-zero correlation between quantities precludes the
possibility of statistical independence. In both semi-Lagrangian and Eulerian frames, the
shear squared - $N^2$ correlation coefficient,

$$\frac{\langle (\partial u/\partial z)^2 N^2 \rangle - \langle (\partial u/\partial z)^2 \rangle \langle N^2 \rangle}{\left[ \langle (\partial u/\partial z)^4 \rangle - \langle (\partial u/\partial z)^2 \rangle^2 \right]^{1/2} \left[ \langle N^4 \rangle - \langle N^2 \rangle^2 \right]^{1/2}}, \tag{10}$$

was of order 0.5, for average spatial separations between isopycnals (semi-Lagrangian)
and vertical differencing intervals (Eulerian) of 4-20 m. (Desaubies and Smith (1982)
assumed this correlation to be zero.)

Figure 6. Shear magnitude, $[(\partial u/\partial z)^2 + (\partial v/\partial z)^2]^{1/2}$, plotted as a function of depth and time. Darker
shading represents greater values of shear magnitude. Solid lines represent the depths of a selected set of
isopycnals of uniform mean separation. There is evidence of the shear layers being vertically advected
along with the density field by high frequency internal waves. This is seen most clearly at depths 80-200
m. Irregular shear variability below 300 m reflects imprecision in the sonar velocity measurement at great
range.

In contrast, the correlation between velocity difference squared and $N^2$ was negative, of
order -0.1 at 20 m scales, decreasing to -0.3 at 4 m mean isopycnal separation. The
negative correlation indicates that larger values of $\Delta u^2$ were seen when isopycnals were far
apart (small $N^2$), while smaller velocity differences are found when isopycnals are closely
spaced (large $N^2$).

It is attractive to hypothesize that velocity difference and strain truly are uncorrelated.
The observed correlation could result from the finite resolution of the Doppler sonar.
Velocity differences along isopycnals are unbiased provided isopycnal separation is large
compared to the sonar resolution scale. As isopycnals converge, estimates are biased
low.

A model of the biasing effect was created, taking care to account for the non-Gaussian
nature of the $\Delta u^2$ and $N^2$ fields. The model assumed the actual independence of these
fields. The apparent correlation was then calculated, after modeling the effect of finite
sonar resolution. The agreement between modeled and observed correlation was good,
consistent with the hypothesis that $\Delta u^2$ and $N^2$ are indeed independent.
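
The kinematic argument of this section, that shear and velocity difference cannot both be independent of strain, can be illustrated with synthetic data. The sketch below is not the bias model described above and ignores sonar resolution entirely. It assumes, purely for illustration, that the normalized separation $\gamma$ is Gamma distributed with unit mean and shape 11, that the normalized squared velocity difference is exponentially distributed and independent of $\gamma$, that $N^2$ varies as $1/\gamma$, and that shear squared varies as $\Delta u^2/\gamma^2$. All names and parameter values are hypothetical.

```python
import math
import random


def correlation(x, y):
    """Sample correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    return cov / math.sqrt(var_x * var_y)


random.seed(3)
shape = 11.0            # assumed Gamma shape for the normalized separation
n_samples = 200000

gamma = [random.gammavariate(shape, 1.0 / shape) for _ in range(n_samples)]
du2 = [random.expovariate(1.0) for _ in range(n_samples)]     # independent of gamma

n2 = [1.0 / g for g in gamma]                                  # N^2 ~ inverse strain
shear2 = [r / (g * g) for r, g in zip(du2, gamma)]             # (du/dz)^2 ~ du^2 / dz^2

corr_du2_n2 = correlation(du2, n2)
corr_shear2_n2 = correlation(shear2, n2)

print("corr(du^2, N^2)    =", corr_du2_n2)      # near zero by construction
print("corr(shear^2, N^2) =", corr_shear2_n2)   # clearly positive
```

Under these assumptions the squared velocity difference is uncorrelated with $N^2$ while the shear squared is positively correlated with it, which is the pattern the independence hypothesis predicts once measurement bias is removed.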

Toward  a  Statistical  Model  of  Richardson  Number 

The  indications  of  Figure  6  and  the  correlation  studies  referred  to  above  suggest  a 
particularly  simple  approach  to  the  modeling  of  Richardson  number.  In  a  semi-Lagrangian 
frame,  consider 


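$$Ri(t,\overline{\Delta z}) \equiv \frac{N^2(t,\overline{\Delta z})}{(\partial u/\partial z)^2(t,\overline{\Delta z})} = \frac{N^2(t,\overline{\Delta z})}{\Delta u^2(t,\overline{\Delta z})/\Delta z^2(t,\overline{\Delta z})} = \frac{\overline{N^2}\,\gamma(t)}{\Delta u^2(t)/\overline{\Delta z}^2}. \tag{11}$$

It is convenient to define a scale Richardson number $Ri^* = \overline{N^2}/(\langle \Delta u^2 \rangle/\overline{\Delta z}^2)$ and to model the
normalized Richardson number $R = Ri/Ri^* = \gamma(t)/r(t)$, where

$$r(t) \equiv \Delta u^2(t)/\langle \Delta u^2 \rangle. \tag{12}$$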

Note that the scale Richardson number $Ri^*$ is not, in general, equal to the expected value
of the Richardson number $\langle Ri \rangle$. We proceed by recalling that the pdf of $\gamma$ is given by the
Gamma distribution (1) with one adjustable constant, $\kappa_0$, which appears to have the
near universal value of 1.1. The pdf of the velocity difference between two isopycnals has
not been previously investigated. We hypothesize that the individual components of
horizontal (along isopycnal) velocity difference are Gaussian. Thus $\Delta u^2$ represents the
sum of the squares of two Gaussian quantities. Its associated pdf is presumably chi
squared, with two degrees of freedom. At two degrees of freedom the chi squared pdf
takes on exponential form;

$$P(\Delta u^2) = \frac{1}{\langle \Delta u^2 \rangle}\, e^{-\Delta u^2/\langle \Delta u^2 \rangle}, \qquad P(r) = e^{-r}. \tag{13}$$
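
The step from Gaussian velocity components to an exponential pdf for the squared velocity difference can be verified with a short simulation. In the sketch below the rms amplitude of each component is a hypothetical value; the check is that the normalized quantity $r = \Delta u^2/\langle\Delta u^2\rangle$ has unit variance and that the fraction of samples with $r > 1$ is close to $e^{-1}$, both properties of the pdf $e^{-r}$.

```python
import math
import random

random.seed(2)
n_samples = 200000
sigma = 0.01   # hypothetical rms of each velocity-difference component, m/s

# Squared magnitude of a two-component Gaussian velocity difference.
du2 = [random.gauss(0.0, sigma) ** 2 + random.gauss(0.0, sigma) ** 2
       for _ in range(n_samples)]
mean_du2 = sum(du2) / n_samples

# Normalized squared velocity difference r = du^2 / <du^2>.
r = [x / mean_du2 for x in du2]
variance_r = sum((x - 1.0) ** 2 for x in r) / n_samples
fraction_above_one = sum(1 for x in r if x > 1.0) / n_samples

print("variance of r:", variance_r)                      # exponential pdf gives 1
print("P(r > 1):", fraction_above_one, "vs", math.exp(-1.0))
```

On semi-logarithmic axes a histogram of `r` would fall on a straight line, which is the diagnostic used in Figure 7.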


In  Figure  7  the  probability  density  function  of  horizontal  velocity  difference  squared  is 
plotted  for  a  variety  of  mean  isopycnal  separations.  The  velocities  are  normalized  by  the 
mean  isopycnal  separation  to  produce  a  quantity  with  units  of  shear,  which  actually 
represents the statistics of squared velocity difference between moving isopycnals. The
pdfs  are  very  nearly  exponential  in  form  with  a  velocity  difference  (shear)  variance  that 
increases  (decreases)  with  increasing  mean  separation  Az.  In  contrast  to  the  pdfs  of 
strain,  which  are  highly  skewed  at  small  separation,  becoming  nearly  Gaussian  as  Az 
increases,  the  pdfs  of  squared  velocity  difference  are  of  nearly  unchanging  form.  Only  the 
scale  variance  changes  significantly  with  mean  separation,  Az. 


Hypothesizing the independence of $\Delta u^2$ and $\gamma$, we can form the joint pdf of shear$^2$ and
strain.

$$P(\Delta u^2, \gamma; \overline{\Delta z}) = \frac{\kappa^{\kappa}\, \gamma^{\kappa-1}\, e^{-\kappa\gamma}\, e^{-\Delta u^2/\langle \Delta u^2 \rangle}}{\Gamma(\kappa)\, \langle \Delta u^2 \rangle}. \tag{14}$$

Here $\kappa = \kappa_0 \overline{\Delta z}$ and $\langle \Delta u^2 \rangle$ are functions of $\overline{\Delta z}$. Identifying the normalized Richardson
number $R(\overline{\Delta z}) = Ri/Ri^* = \gamma/r$, we can integrate (14) to obtain

$$P(R) = \frac{\kappa}{R(\kappa R + 1)} \left[ \frac{\kappa R}{\kappa R + 1} \right]^{\kappa}. \tag{15}$$

Figure 7. Histograms of incidence of occurrence as a function of velocity difference squared,
formed in a semi-Lagrangian frame. Velocity differences are normalized by $\overline{\Delta z}$, to convert to
shear-like units. Histograms are plotted for mean separations of 4, 8, 12, 16 and 20 m. An
exponential model for the pdf of squared velocity difference implies a linear form for these
semi-logarithmic histograms.
[Panel title: Probability Density of semi-Lagrangian Shear Squared.]


Similarly, in an Eulerian frame one has

$$P(R) = \frac{\kappa(\kappa + 1)}{(\kappa R + 1)^2} \left[ \frac{\kappa R}{\kappa R + 1} \right]^{\kappa}. \tag{16}$$
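
Both model pdfs of normalized Richardson number can be checked by Monte Carlo. Writing $x = \kappa R/(\kappa R + 1)$, the semi-Lagrangian pdf $\kappa x^{\kappa}/[R(\kappa R + 1)]$ integrates to the cumulative form $x^{\kappa}$, and the Eulerian pdf $\kappa(\kappa+1)x^{\kappa}/(\kappa R + 1)^2$ integrates to $x^{\kappa+1}$. The sketch below compares these with simulated ratios $R = \gamma/r$. It assumes an exponential $r$ with unit mean, a unit-mean Gamma $\gamma$ of shape $\kappa$ in the semi-Lagrangian frame, and, as an assumption of this illustration that is consistent with the Eulerian formula, a Gamma $\gamma$ of shape $\kappa + 1$ and rate $\kappa$ in the Eulerian frame. The value $\kappa = 4.4$ is illustrative.

```python
import random


def cdf_semi_lagrangian(R, kappa):
    """Cumulative probability of normalized Richardson number, semi-Lagrangian model."""
    x = kappa * R / (kappa * R + 1.0)
    return x ** kappa


def cdf_eulerian(R, kappa):
    """Cumulative probability of normalized Richardson number, Eulerian model."""
    x = kappa * R / (kappa * R + 1.0)
    return x ** (kappa + 1.0)


def simulated_fraction_below(R0, shape, rate, n_samples, rng):
    """Fraction of samples with gamma / r <= R0, for Gamma gamma and exponential r."""
    count = 0
    for _ in range(n_samples):
        g = rng.gammavariate(shape, 1.0 / rate)
        r = rng.expovariate(1.0)
        if g <= R0 * r:          # same event as g / r <= R0, without dividing
            count += 1
    return count / n_samples


rng = random.Random(4)
kappa = 4.4      # kappa0 * mean separation, illustrative value
R0 = 1.0
n_samples = 200000

frac_sl = simulated_fraction_below(R0, kappa, kappa, n_samples, rng)
frac_eu = simulated_fraction_below(R0, kappa + 1.0, kappa, n_samples, rng)

print("semi-Lagrangian:", frac_sl, "model:", cdf_semi_lagrangian(R0, kappa))
print("Eulerian:       ", frac_eu, "model:", cdf_eulerian(R0, kappa))
```

The cumulative forms are convenient in practice because the probability of finding the normalized Richardson number below any threshold follows directly, without numerical integration of the pdf.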


Plots  of  the  pdf  of  normalized  Richardson  number  are  presented  in  Figure  8.  The  semi- 
Lagrangian  pdf  is  slightly  more  peaked  than  the  Eulerian  at  small  separations  Az.  This 
difference  decreases  with  increasing  mean  separation.  Again,  in  contrast  to  the  strain,  the 
skewness  of  these  pdfs  varies  only  weakly  with  increasing  mean  separation. 

The  initial  comparison  between  the  SWAPP  observations  and  the  model,  while 
preliminary,  is  quite  encouraging  (Fig.  9). 


3.  DISCUSSION 

To predict statistics of the actual (not normalized) Richardson number, at scales $\overline{\Delta z}$, or
depths $z$ beyond the resolution and reach of the SWAPP instruments, one must know the
behavior of $\kappa$ and $\langle \Delta u^2 \rangle$ as functions of depth and mean separation. Strain appears
well modeled at scales $\overline{\Delta z} > 3$ m by the model presented above. The universality of the
single adjustable parameter $\kappa_0$ is open to question. However, the behavior of $\langle \Delta u^2 \rangle$ is
even less well known. In this study the estimates are influenced by instrument noise,
which adds to the true variance, and instrument resolution, which detracts from it. Proper
modeling of these effects is required for accurate estimates of Ri.

Figure 8. Model probability density functions of semi-Lagrangian (solid) and Eulerian (dashed)
normalized Richardson number, for mean separations of 2-10 m. The difference between
semi-Lagrangian and Eulerian observations decreases with increasing vertical separation. However,
in contrast with the pdfs of strain, neither function approaches Gaussian form at large
mean separation.
[Panel title: Model PDF of Normalized Richardson Number.]

From vertical profiling measurements, there is a body of experience relating to the
statistics of $\langle \Delta u^2 \rangle$, at least in an Eulerian frame. Gargett et al. (1981) were the first to
synthesize a composite shear spectrum from a variety of profiling sensors. They
concluded that the Eulerian shear spectrum has a form generally similar to the model strain
spectrum presented in Figure 5d, being white at vertical wavenumbers less than 0.1 cpm
and of $k^{-1}$ slope at higher wavenumber. The shear spectral level scales as $\langle N^2 \rangle$, in contrast
to the strain spectral level, which is independent of Vaisala frequency. The Gargett et al.
(1981) empirical observation, sustained by more recent research, is that the spectral
transition near 0.1 cpm is not a strong function of $N$. This behavior is inconsistent with
linear dynamics in the WKB approximation. Vertical wavenumbers should vary as $N$
in a WKB pycnocline.

Figure 9. A comparison between the observed pdf of normalized semi-Lagrangian Richardson
number and the corresponding model pdf. Agreement is generally good, although a clear
systematic offset is seen. Observed values of low normalized Richardson number occur less
often than predicted. Instrument noise and resolution affect both the instantaneous
Richardson number observations and the value of the normalization factor Ri*. Data from
depths 150-184 m are used in this comparison.
[Panel title: Normalized Richardson Number. Horizontal axis: Ri/Ri*, 0 to 5.]


If  the  Gargett  et  al.  scalings  are  applied  to  the  present  statistical  model  of  Richardson 
number,  the  scale  Richardson  number  Ri*  is  depth  independent.  The  frequency  of 
observance  of  instabilities  should  thus  be  independent  of  depth.  This  contrasts  with  the 
early  internal  wave  breaking  model  of  Garrett  and  Munk  (1972).  It  is  more  consistent 
with  the  later  view  of  Munk  (1981). 

There  are  several  major  concerns  with  the  Richardson  number  modeling  effort  presented 
here.  First  and  foremost,  observations  of  overturning  and  instability  in  the  thermocline 
typically  indicate  an  overturning  scale  of  a  few  meters  or  less.  The  model  developed  here 
is  not  supported  by  the  observations  at  scales  less  than  3  m.  In  part  this  is  due  to  the  noise 
and resolution limits of the data. However, the Poisson strain model becomes internally
inconsistent at scales smaller than κ₀⁻¹, the correlation scale of the strain field. The
relevance of this “finescale” model of Richardson number variability to the occurrence of
oceanic turbulence remains to be demonstrated.

A related concern is that the Richardson number might not be the parameter that is
most sensitive to the occurrence of oceanic turbulence. Orlanski and Bryan (1969)
suggested that a second form of instability, termed convective instability, was responsible
for the bulk of the mixing in the sea. While Munk (1981) argued that both forms of
instability were sensitive to the same aspects of the internal wave spectrum, the space/time
distribution of convective mixing events might be far different than that of the events
resulting from low Richardson number instability.

To investigate this concern, a microstructure probe was mounted on the CTD used in the
SWAPP experiment. The sensor, a Seabird dual electrode microconductivity cell, was
capable of resolving overturns on scales as small as 10 cm. In the next phase of the
analysis of this data set, we will attempt to correlate the occurrence of microstructure
signals with the depth-time variation in finescale Richardson number. The degree of
correlation will bear testimony to the relevance of the Richardson number as an indication
of mixing in the thermocline.


STRUCTURE OF THE UPPER OCEAN VELOCITY
FIELD ON SCALES LARGER THAN 10 KILOMETERS

Abstract

Upper  ocean  currents,  illustrated  here  by  shipboard  ADCP  data,  are  a  complex 
function  of  both  space  and  time.  Vertical  shear  is  strong  near  the  equator  and 
decreases  toward  the  poles.  Particularly  strong  currents  are  found  near  the  equator, 
in  the  southern  ocean,  and  on  western  boundaries.  High  variability  sometimes,  but 
not  always,  coincides  with  strong  mean  currents.  Inertial  oscillations  are  ubiquitous 
and  can  dominate  a  dataset.  Their  spatial  structure  has  not  been  well  observed.  An 
exploratory  attempt  to  calculate  horizontal  wavenumber  spectra  from  vertically 
averaged  shipboard  ADCP  measurements  shows  potentially  interesting  differences 
between two sections, one at 35°N, the other near 18°N.


Introduction 

Our  knowledge  of  upper  ocean  currents  is  sketchy.  The  broad  outlines  come  from 
statistical  summaries  of  ship  drift  reports  accumulated  over  more  than  a  century. 
This  global  dataset  shows  the  locations  and  typical  speeds  of  the  major  surface 
currents,  their  average  annual  cycle,  and  a  measure  of  their  variability  apart  from 
the  annual  cycle  (e.g.,  Wyrtki  et  al.,  1976;  Richardson  and  Walsh,  1986;  Richardson 
and  McKee,  1989).  The  horizontal  resolution  of  this  dataset  is  coarse,  typically 
1-5°,  and  it  indicates  only  currents  averaged  over  the  hull  depth  of  the  ships.  There 
are  many  sources  of  error,  such  as  the  effects  of  wind  and  waves  on  the  ship. 
Temporal  resolution  is  also  poor — the  dataset  is  climatological,  not  synoptic.  A 
second  source  of  information  about  upper  ocean  currents  is  the  hydrographic 
dataset,  from  which  the  geostrophic  component  of  the  currents  may  be  calculated  as 
a  function  of  depth,  not  just  at  the  surface  (e.g.,  Toole  et  al.,  1988;  Picaut  and 
Tournier,  1990).  When  treated  climatologically,  this  dataset  has  the  same  coarse 
resolution  as  ship  drift  data,  but  individual  hydrographic  sections  can  be  inspected 
for  a  quasi-synoptic  picture  of  the  geostrophic  current  perpendicular  to  the  ship 
track  with  a  horizontal  resolution  of  0.5°  or  so.  A  third  source  of  upper  ocean 
current measurements is the surface drifter dataset (e.g., Hansen and Paul, 1984). It
gives no information on vertical structure but gives a quasi-Lagrangian picture of
horizontal and temporal variations of currents at 10-15 m depth. It has recently
been  shown  that  time-averaged  horizontal  gradients  of  currents  can  be  resolved  on 
scales  as  small  as  5  km  by  suitable  averaging  of  a  large  drifter  data  set  (Poulain, 
1993).  A  fourth  source  of  current  measurements  is  the  moored  current  meter 
dataset  (e.g.,  McPhaden  and  Taft,  1988;  Whitworth  et  al.,  1991).  Temporal 
resolution  is  excellent,  typically  one  hour  or  less.  Horizontal  resolution  can  be 
arbitrarily  fine,  but  horizontal  coverage  is  limited  by  the  cost  and  availability  of 
moorings.  An  array  rarely  includes  more  than  20  or  so  moorings. 

During the last decade, a new source of upper ocean current measurements has been
developed: the shipboard Acoustic Doppler Current Profiler (ADCP). An ADCP is
now  standard  equipment  on  most  research  ships.  The  typical  instrument  (model 
VM-150  made  by  RD  Instruments)  can  measure  currents  relative  to  the  ship  at  8-m 
depth  intervals  from  about  20  m  down  to  a  maximum  range  of  200-450  m, 
depending  on  ambient  noise  and  the  density  of  acoustic  scatterers.  Individual 
profiles,  measured  once  per  second,  are  averaged  into  ensembles  of  a  few  minutes. 
The accuracy of these averages is usually of order 1 cm s⁻¹, although biases of order
10 cm s⁻¹ can occur (Chereskin and Harding, 1993; Wilson and Firing, 1992). The
velocity of the ship, measured by differencing position fixes, is added to the current
profile relative to the ship to yield a profile of water velocity relative to the earth.
With present Global Positioning System (GPS) navigation, 95% of fixes are within
100 m of the correct position. Fix errors are correlated over intervals of order 10
minutes. Velocity errors can be reduced to about 2 cm s⁻¹ standard deviation in
each component by differencing fixes 30 minutes apart. With a typical ship speed of
6 m s⁻¹, this means the effective horizontal resolution for absolute velocity profiles
is about 10 km.
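The arithmetic behind the 10 km figure, and the way ship velocity is combined with the relative profile, can be sketched as follows. This is a minimal illustration only: the function names are hypothetical, positions are treated as one-dimensional along-track distances, and nothing here represents actual ADCP processing software.

```python
def ship_velocity_from_fixes(x_start_m, x_end_m, interval_s):
    """Ship velocity (m/s) from two position fixes separated by interval_s seconds."""
    return (x_end_m - x_start_m) / interval_s

def absolute_velocity(relative_m_s, ship_m_s):
    """Water velocity over the ground: velocity relative to the ship plus ship velocity."""
    return relative_m_s + ship_m_s

def along_track_resolution_km(ship_speed_m_s, differencing_minutes):
    """Distance the ship covers between the two fixes that are differenced."""
    return ship_speed_m_s * differencing_minutes * 60.0 / 1000.0

# Values quoted in the text: 6 m/s ship speed, fixes differenced 30 minutes apart.
resolution_km = along_track_resolution_km(6.0, 30.0)

# Hypothetical example: the ship moves 10.8 km in 30 minutes while the ADCP
# measures -5.8 m/s along-track relative to the ship.
ship_speed = ship_velocity_from_fixes(0.0, 10_800.0, 1800.0)
water_speed = absolute_velocity(-5.8, ship_speed)
```

With these inputs the resolution comes out at 10.8 km, consistent with the "about 10 km" stated above.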

The  purpose  of  this  note  is  to  show  something  of  the  character  and  complexity  of 
upper ocean currents. We will use ADCP measurements from a few cruises in the
Pacific  to  show  how  typical  current  speeds,  horizontal  scales,  and  vertical  shears 
vary  with  latitude.  We  will  illustrate,  but  not  solve,  the  problem  of  combined 
temporal  and  spatial  variability  in  the  shipboard  ADCP  dataset.  Examples  of 
simple  statistical  analysis  of  this  dataset  will  be  given.  They  will  show  some 
features  of  ocean  currents  that  have  not  been  accessible  previously  and  perhaps  help 
motivate  more  extensive  and  sophisticated  statistical  analysis  of  shipboard  ADCP 
data  in  the  future. 


The Central Pacific from 35°N to 60°S


Recent WOCE (World Ocean Circulation Experiment) Hydrographic Program
(WHP) cruises provide high-quality current and hydrographic measurements
spanning the Pacific. Here we will look at the shipboard ADCP measurements from
the central and southern portions of WHP lines P16, nominally along 150°W, and
P17, nominally along 135°W. The cruises occurred in four legs on two ships: RV
Thomas Washington in June through September 1991 (Talley and Swift, 1992) and
RV Knorr in October and November 1992. Along most of the cruise track, 3-4-hour
CTD stations were occupied at half-degree intervals.

A  map  of  shallow  current  vectors  shows  how  the  character  of  the  current  field 
changes  with  latitude  (Figure  1).  Several  distinct  regimes  can  be  distinguished.  The 
region  that  probably  catches  the  eye  first  is  the  band  of  strong,  predominantly  zonal 
currents within about 10° of the equator. Typical speeds are 50 cm s⁻¹, and the
widths  of  the  currents  are  2-5°.  The  main  currents  seen  here — the  eastward  North 
Equatorial  Countercurrent  (NECC),  and  the  westward  South  Equatorial  Current 
(SEC)  split  at  the  equator  by  a  shallow  fraction  of  the  eastward  Equatorial 
Undercurrent  (EUC) — can  be  identified  easily  in  most  cross  equatorial  sections  in 
the  central  or  eastern  Pacific.  Still,  as  the  difference  between  the  sections  on  135°W 
and  150°W  suggests,  their  variability  in  time  and  space  is  substantial. 

Poleward  of  the  equatorial  zone,  in  the  tropics  through  mid  latitudes,  the  typical 
currents are relatively weak, perhaps 20 cm s⁻¹, and their horizontal scale of
variability  is  only  1-2°  or  less.  What  we  see  in  these  regions  of  Figure  1  is  a  field  of 
eddies  and  other  variability  superimposed  on  a  weak  mean  flow.  The  typical  speeds 
and  the  horizontal  length  scales  decrease  with  increasing  distance  from  the  equator 
until  about  50°S,  the  northern  edge  of  the  Antarctic  Circumpolar  Current  (ACC). 
There appears to be an abrupt decrease in eddy energy and scales at about 30°N
and  S,  separating  the  ocean  into  a  high-energy  extra-equatorial  region  from  10-30° 
and  a  low-energy  region  from  30-50°.  This  tentative  conclusion  needs  to  be  checked 
against  additional  datasets.  High  eddy  energy  can  also  be  seen  in  Figure  1  near  the 
Hawaiian Islands, as expected (Patzert, 1969).

On 150°W, the ACC appears to extend from 50°S to perhaps 62°S as a series of
threads  1-2°  wide.  To  the  east,  however,  we  see  one  or  more  large  eddies  or  loops  in 
the  current.  It  appears  that  the  ACC  must  turn  north  just  east  of  150°W  and  then 
loop  south  at  about  140°W. 

To see how the vertical structure of the currents varies with latitude, we turn to
representative contoured sections (Figure 2). Along 35°N from the California coast
to 135°W, most of the major current features are coherent in the vertical but
decrease in amplitude with increasing depth. Typical vertical shears are of order
10⁻³ s⁻¹. In the equatorial band, by contrast, distinctly different currents are found
at different depths in the upper 400 m. The eastward Northern Subsurface
Countercurrent (NSCC; Tsuchiya, 1975), for example, has a maximum speed of
40 cm s⁻¹ at 4.5°N, 230 m. Shears in the equatorial zone reach as high as
2 × 10⁻² s⁻¹. In the ACC we find the opposite extreme; shears in the upper 400 m
are typically less than 2 × 10⁻⁴ s⁻¹.

Figure 1. Currents averaged from 25 to 75 m on the central and southern portions of WHP lines P16 and
P17, plus transits to and from port. These ADCP measurements were made on the Thomas Washington
from May 31 to October 2, 1991, and on the Knorr from October 6 to November 27, 1992. The Knorr
cruise went from Tahiti to 62.5°S and back.

Figure 2. Upper ocean currents off the California coast (a: meridional component), near the equator (b:
zonal component), and in the Southern Ocean (c: zonal component). Southward and westward flow is
shaded. All contours are at 10 cm s⁻¹ intervals. The axes are scaled uniformly in all three panels.


Across the Pacific at 10°N

From February through May 1989, the RV Moana Wave crossed the Pacific (Wijffels
et al., 1993). The cruise was run in three legs from west to east, mostly along 9.5°N,
just north of the boundary between the NECC and the North Equatorial Current
(NEC). CTD casts to the bottom were made every 2° of longitude, with closer
spacing near the boundaries.

Along  most  of  the  section,  the  zonal  component  of  current  is  westward,  part  of  the 
NEC  (Figure  3).  Eddy-like  variability  is  present  everywhere,  but  is  particularly 
strong near the western boundary and east of about 130°W. The dominant
horizontal scales of this variability appear to vary from 1-5°. The signature of
tropical instability waves (Hansen and Paul, 1984) is perhaps most evident in the
strong currents near 120°W. These currents are very shallow; most of the energy in
the  eastern  part  of  the  section  is  found  above  100  m. 

The strongest currents of the section are found within 10° of the western boundary.
The southward flow at the Philippine coast is the Mindanao Current, a permanent
western boundary current (Lukas et al., 1991). Fortunately, there are repeated
sections across the Mindanao Current from several measurement programs; we will
look here at data from cruises 3, 4, 5, 6, and 8 of the US/PRC TOGA Program
(Delcroix et al., 1992), from 1987 to 1990. Two of these cruises occurred in boreal
fall, three in boreal spring. The mean meridional velocity component shows the
Mindanao Current and little else; almost all of the region from 129°E to the end of
the section at 141.5°E has a mean current below 10 cm s⁻¹ (Figure 4). The mean
Mindanao Current is less than 2° wide, has a maximum speed near the coast
exceeding 80 cm s⁻¹, and extends below the 350-m depth range of these
measurements. The pattern of variability differs greatly from the mean. The
standard deviation is minimal, only 5-10 cm s⁻¹, at the coast, where the Mindanao
Current is strongest. The standard deviation then increases eastward with maxima
greater than 30 cm s⁻¹ on the edge of the mean Mindanao Current and beyond the
edge at 129°E. East of there, most of the variability is found in the upper 100 m,
with typical standard deviations from 15-25 cm s⁻¹. Below 100 m the standard
deviations are mostly 5-10 cm s⁻¹. There is no obvious seasonal difference in the
currents in this dataset; variations among sections of the same season are as large as
variations between the two seasons.
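The mean and standard deviation over repeated sections can be computed bin by bin, as in the minimal sketch below. The array layout (cruise, depth, longitude), the function name, and the numbers are all hypothetical stand-ins rather than the TOGA data, and the use of the sample (n - 1) standard deviation is an assumption.

```python
import numpy as np

def section_statistics(sections):
    """Mean and sample standard deviation across repeated sections.

    sections : array of shape (n_cruises, n_depths, n_longitudes);
               NaN marks bins with no data on a given cruise.
    """
    sections = np.asarray(sections, dtype=float)
    mean = np.nanmean(sections, axis=0)
    std = np.nanstd(sections, axis=0, ddof=1)
    return mean, std

# Synthetic meridional velocities (cm/s) for five cruises, one depth, two longitudes:
# a strong steady boundary current next to a weak but variable offshore flow.
v = np.array([
    [[-80.0, 10.0]],
    [[-90.0, -20.0]],
    [[-85.0, 30.0]],
    [[-75.0, -10.0]],
    [[-95.0, 0.0]],
])
mean_v, std_v = section_statistics(v)
```

In this made-up example the strong current has the smaller standard deviation, which is the kind of contrast between the mean and the variability pattern described above.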


Typhoon-generated Currents

So  far,  we  have  interpreted  shipboard  ADCP  measurements  as  showing  primarily 
the  spatial  structure  of  currents  along  a  section;  we  have  inferred  temporal 
variability only from cruise-to-cruise differences. Given this mindset, we would look
at Figure 5 and conclude that there was an extraordinary series of eddies south of
Samoa, with a wavelength of 3° and maximum speeds of nearly 100 cm s⁻¹. This
conclusion  would  be  wrong. 

The  Moana  Wave  left  American  Samoa  for  New  Zealand  on  February  9,  1990,  just 
six  days  after  Typhoon  Ofa  passed  60  miles  west  of  Savai’i  in  Western  Samoa.  On 
February  13  the  Moana  Wave  track  crossed  the  path  of  Ofa  eight  days  before,  at 
about  19°S.  The  strong  currents  in  Figure  5  north  of  20°S  are  near-inertial 
oscillations  excited  by  Ofa’s  winds,  mainly  to  the  left  of  Ofa’s  path  where  the  wind 
direction rotated anticyclonically (Lien et al., 1993). The wavenumber vector for
these  oscillations  is  along  Ofa’s  path,  roughly  perpendicular  to  the  ship  track. 
Therefore  the  currents  measured  from  the  moving  ship  can  be  treated  as  a  time 
series.  Looking  at  the  time  series  of  currents  as  a  function  of  depth  (Figure  6),  we 
see  that  currents  were  uniform  in  the  vertical  above  about  80  m,  presumably  the 
mixed  layer  depth.  Substantial  energy  had  propagated  below  the  mixed  layer  by  the 
time of these measurements; currents at 200 m were at times as strong as, or
stronger  than,  those  in  the  mixed  layer.  Phase  propagation  was  upward,  consistent 
with  downward  energy  propagation,  and  there  is  a  corresponding  shift  to  higher 
frequencies  (blue  shift)  with  increasing  depth  (Price,  1983). 

Apart  from  its  interest  as  an  ocean  phenomenon,  this  instance  of  unusually  strong 
inertial oscillations illustrates a general problem in determining the spatial structure
of  ocean  currents:  measurements  almost  always  mix  spatial  with  temporal 
variability.  There  is  no  measurement  system  in  the  ocean  that  can  provide  broad 
spatial  coverage,  high  spatial  resolution  in  more  than  one  dimension,  and  good 
temporal  resolution,  all  at  the  same  time. 

Figure 6. Current vectors (up is northward, to the right is eastward) as a function of time and depth,
along the Moana Wave cruise track near where it crossed the path of Typhoon Ofa in February 1990.

Near-inertial energy can be identified in some shipboard ADCP sections even
without extraordinary forcing such as a typhoon. Wijffels et al. (1993) calculated
frequency spectra of currents from the 10°N Moana Wave section (Figure 3). They
found a prominent near-inertial peak in the clockwise spectrum of the shear between
20 m and 100 m, and concluded that the zonal wavenumber must be small
compared to 2π divided by 650 km, the distance the ship travelled in one inertial
period. This seems reasonable if the inertial oscillations are excited by wind
fluctuations that also have much longer zonal scales than 650 km. The expected
meridional wavenumber is larger than the zonal wavenumber because of the
variation of inertial frequency with latitude (D’Asaro, 1989). At low latitudes, the
tendency for wind fluctuations to have larger zonal than meridional scales should
increase the anisotropy in the near-inertial wavenumber spectrum. This spectrum
has not yet been measured definitively, however.
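The reasoning behind the bound on the zonal wavenumber can be written out as a short worked equation. This is a sketch under assumptions not stated in the text: a single plane wave with zonal wavenumber k and frequency ω close to the inertial frequency f, and a ship moving zonally at a constant mean speed U. The symbols φ, ω_obs and T_f are introduced here only for illustration.

```latex
% Phase of a plane wave sampled along the ship track x = U t
\phi(t) = k\,x - \omega t = (kU - \omega)\,t
\quad\Longrightarrow\quad
\omega_{\mathrm{obs}} = \omega - kU .

% The peak is seen near f = 2\pi / T_f only if the Doppler term kU is small:
|k|\,U \ll f
\quad\Longleftrightarrow\quad
|k| \ll \frac{2\pi}{U\,T_f} = \frac{2\pi}{650\ \mathrm{km}} ,

% because U T_f is the distance the ship travels in one inertial period.
```

Read this way, a peak observed close to the inertial frequency implies that the along-track Doppler shift was small, which is the condition quoted from Wijffels et al. (1993).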


Horizontal  Wavenumber  Spectra 

Having  just  demonstrated  the  dangers  of  interpreting  shipboard  ADCP  sections  in 
terms  of  spatial  rather  than  temporal  variability,  we  will  proceed  to  do  just 
that — but  gingerly,  watching  out  for  temporal  signals.  Two  data  sets  will  be  used 
here: the WHP P17 cruise of the Thomas Washington (Figure 1); and a cruise of the Moana
Wave  from  Pohnpei  to  Hawaii  in  July  1990  (MW9009;  Figure  7).  These  are  chosen 
because  they  include  fairly  long,  nearly  zonal  sections  at  different  latitudes  but 
within  the  mid-gyre  current  regime,  where  the  mean  flow  is  weaker  than  the  eddies. 

From the P17 cruise we will use the transect from the California coast to 135°W,
nominally along 35°N. The large-scale flow is southward, comprising the general
southward Sverdrup flow of the gyre plus the California Current (Figure 2). The
section is 1230 km long. It was sampled by block-averaging intervals of 0.05° (3.155
km). Including CTD station time, the ship covered the 13° in 6.1 days. The inertial
period at 35°N, 20.9 hours, thus corresponds to about 2° wavelength along the
track. If the zonal wavelength of the inertial oscillations was much larger than 2°,
then the near-inertial spectral peak would appear at 1 cycle per 2° in the
wavenumber spectrum of the currents measured from the ship.

Figure 7. Currents averaged from 25 to 75 m on a Moana Wave cruise (MW9009) from Pohnpei to
Hawaii, July 9-25, 1990.

From  MW9009  we  select  the  relatively  straight  transit  eastward  and  slightly 
northward  from  16°N  168°E  to  Oahu,  21°N  158°W.  For  convenience,  we  can  assign 
this  section  a  nominal  latitude  of  18°N.  The  westward  flow  of  the  North  Equatorial 
Current is apparent only on the western half of the section. Like P17, this section
was block-averaged in 0.05° longitude (5 km) bins. There were no pauses in the
transit,  so  only  eight  days  were  required  to  cover  the  33.8°  (about  3570  km)  in 
longitude.  The  inertial  period  ranges  from  43.5  hours  at  16°N  to  33.5  hours  at  21°N; 
a  40-hour  period  corresponds  to  about  7°  along  the  track. 
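The inertial periods and along-track wavelengths quoted for the two sections can be checked with a few lines of arithmetic. The sketch below assumes the inertial period is taken as 12 hours divided by the sine of latitude, which reproduces the rounded values in the text; half a sidereal day, about 11.97 hours, would be slightly more exact. The function names are hypothetical.

```python
import math

def inertial_period_hours(latitude_deg, half_day_hours=12.0):
    """Inertial period 2*pi/f, with f = 2*Omega*sin(latitude)."""
    return half_day_hours / abs(math.sin(math.radians(latitude_deg)))

def along_track_wavelength_deg(period_hours, span_deg, duration_days):
    """Longitude covered during one period at the ship's mean speed over the section."""
    mean_speed_deg_per_hour = span_deg / (duration_days * 24.0)
    return period_hours * mean_speed_deg_per_hour

period_35n = inertial_period_hours(35.0)
period_16n = inertial_period_hours(16.0)
period_21n = inertial_period_hours(21.0)

# P17: 13 degrees of longitude in 6.1 days, including station time.
wavelength_p17 = along_track_wavelength_deg(period_35n, 13.0, 6.1)
# MW9009: 33.8 degrees in eight days, using the 40-hour period quoted in the text.
wavelength_mw9009 = along_track_wavelength_deg(40.0, 33.8, 8.0)
```

The two wavelengths come out near 1.9° and 7.0°, matching the "about 2°" and "about 7°" stated for the P17 and MW9009 sections.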

Horizontal  wavenumber  spectra  were  calculated  from  the  Fourier  transforms  of 
128-point segments, overlapping by 64 points. The segments were tapered with a
parabolic window (Press et al., 1986). The periodograms for each segment were
averaged to yield spectral estimates and normalized so that integrating the
single-sided spectral density gives the total variance in the record. There were four
segments giving six degrees of freedom for the P17 spectral estimates; and 10
segments  giving  16  degrees  of  freedom  for  MW9009. 
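The procedure described above (overlapping segments, parabolic taper, averaged periodograms, variance-preserving normalization) can be sketched with NumPy as follows. This is an illustration under stated assumptions, not the code used for the paper: the end-point convention of the parabolic window, the removal of each segment's mean, and the function name are choices made here, and the input is synthetic white noise.

```python
import numpy as np

def averaged_periodogram(x, dx, nseg=128, noverlap=64):
    """Single-sided spectral density from overlapping, parabolically tapered segments.

    x  : evenly spaced series (for example, one velocity component along the track)
    dx : sample spacing; the returned wavenumbers are in cycles per unit of dx
    nseg is assumed to be even, so that the last rfft bin is the Nyquist wavenumber.
    """
    x = np.asarray(x, dtype=float)
    j = np.arange(nseg)
    # Parabolic taper; the exact end-point convention is an assumption.
    window = 1.0 - ((j - 0.5 * (nseg - 1)) / (0.5 * (nseg + 1))) ** 2
    step = nseg - noverlap
    spectra = []
    for start in range(0, len(x) - nseg + 1, step):
        segment = x[start:start + nseg]
        segment = segment - segment.mean()
        coeffs = np.fft.rfft(segment * window)
        # Scale so that summing density * dk recovers the tapered segment variance.
        density = np.abs(coeffs) ** 2 * dx / np.sum(window ** 2)
        density[1:-1] *= 2.0   # fold negative wavenumbers onto positive ones
        spectra.append(density)
    k = np.fft.rfftfreq(nseg, d=dx)
    return k, np.mean(spectra, axis=0), len(spectra)

# Synthetic record: 676 bins of 5 km, roughly the length of the MW9009 section.
rng = np.random.default_rng(0)
velocity = rng.standard_normal(676)
k, density, nsegments = averaged_periodogram(velocity, dx=5.0)
integrated = density.sum() * (k[1] - k[0])   # should be close to the record variance
```

With this simple stepping the 676-point record yields nine full segments, so the segment counts quoted in the text presumably reflect details of the original processing that are not reproduced here.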

To reduce contamination of the spectra by near-inertial oscillations, the velocity
vectors were vertically averaged: from 20-310 m on the P17 section, and from
20-200 m on MW9009 where the ADCP depth range was less. Frequency spectral
analysis of the P17 section (not shown) indicates that the 20-310 m vertical average
suppresses the near-inertial peak but leaves a semidiurnal peak. There seems to be
no corresponding peak in the wavenumber domain, however, perhaps because the
ship was stopped on station for more than half the time. In the time-domain spectra
of the MW9009 section there are no clear near-inertial or semidiurnal peaks even in
the shear (200 m relative to 20 m), but there is a peak near the diurnal period in
both  the  shear  and  the  vertically  averaged  velocity.  This  appears  to  be  due  to  the 
eddy  field  traversed  by  the  ship;  if  so,  the  diurnal  period  is  simply  a  coincidence. 

The horizontal wavenumber spectrum at 35°N falls off roughly as a power law in
wavenumber k, apart from the range 20-40 cptkm (cycles per thousand kilometers)
where it rises above the eyeball-fit power-law line by about a factor of 3 (Figure 8).
The energy is nearly evenly
divided  between  zonal  and  meridional  components,  but  over  most  of  the  range 
above  10  cptkm  there  is  an  excess  of  clockwise  (moving  westward  along  the  track) 
over  counterclockwise  energy.  Most  of  this  energy  is  above  20  cptkm,  well  above  the 
10  cptkm  wavenumber  where  we  might  expect  the  semidiurnal  tide  to  appear  in  this 
dataset  (unless  it  is  Doppler-shifted  substantially).  Hence,  the  cause  and 
significance  of  the  excess  in  clockwise  energy  are  unknown. 

The  wavenumber  spectrum  at  18°N  is  less  energetic  than  the  35°N  spectrum  above 
20 cptkm, and more energetic below 10 cptkm. Above 25 cptkm the spectrum follows
a roughly constant slope, but at lower wavenumbers there is no clear single slope. There is no
disparity  between  the  rotary  components  at  high  wavenumbers.  Below  10  cptkm, 
meridional  energy  exceeds  zonal  energy,  and  clockwise  (eastward  along  the  track) 
energy  exceeds  counterclockwise  energy. 
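The split into clockwise and counterclockwise energy can be illustrated with a rotary decomposition of the complex velocity u + iv. The sketch below is a bare-bones version under assumptions made here: no segment averaging or tapering, the along-track coordinate treated as the independent variable, and positive wavenumbers taken as counterclockwise rotation with increasing distance. How that sense of rotation maps onto time depends on the ship's direction of travel, as the text notes. The function name is hypothetical.

```python
import numpy as np

def rotary_spectra(u, v, dx):
    """Counterclockwise (k > 0) and clockwise (k < 0) spectral densities of u + i*v."""
    w = np.asarray(u, dtype=float) + 1j * np.asarray(v, dtype=float)
    n = len(w)
    w = w - w.mean()
    coeffs = np.fft.fft(w)
    k = np.fft.fftfreq(n, d=dx)
    density = np.abs(coeffs) ** 2 * dx / n      # two-sided density of the complex series
    ccw = k > 0
    cw = k < 0
    return k[ccw], density[ccw], -k[cw], density[cw]

# Synthetic test signal: a vector rotating clockwise with distance, 8 cycles per record.
n, dx = 128, 5.0
x = np.arange(n) * dx
k0 = 8.0 / (n * dx)
u = np.cos(2.0 * np.pi * k0 * x)
v = -np.sin(2.0 * np.pi * k0 * x)
k_ccw, s_ccw, k_cw, s_cw = rotary_spectra(u, v, dx)
peak_wavenumber = k_cw[np.argmax(s_cw)]
```

For this purely clockwise signal essentially all of the energy falls in the clockwise spectrum at wavenumber `k0`; a real section would show energy in both, and the difference between them is what the comparison above refers to.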

The  analysis  given  here  is  intended  as  no  more  than  a  first  exploration  of  the 
possibility  of  studying  the  horizontal  wavenumber  structure  of  upper  ocean  currents 
with  shipboard  ADCP  data.  It  shows  that  in  regions  of  the  ocean  away  from  strong 
mean  currents,  there  are  indeed  differences  in  the  wavenumber  spectra.  We  suspect 
that  part  of  the  difference  shown  here  between  sections  at  35°N  and  18°N  reflects 
the  difference  in  Rossby  radius  of  deformation:  the  eddy  energy  is  concentrated  at 
wavelengths  near  the  Rossby  radius,  which  is  larger  at  lower  latitudes.  Much  of  the 
difference,  however,  is  found  at  shorter  wavelengths,  and  this  remains  to  be 
explained. 


Discussion 

The primary theme of this note has been the spatial variability of ocean currents.
Vertical shear in the upper few hundred meters varies from almost nil at 60°S to
0.03 s⁻¹ near the equator. Large-scale mean currents vary from near 1 m s⁻¹ at the
Mindanao coast to less than 0.1 m s⁻¹ 400 km offshore. Eddies are ubiquitous, but
their typical amplitudes and length scales vary from place to place. Away from
strong currents, both amplitude and length scale tend to vary inversely with
latitude.

Figure 8. Horizontal wavenumber spectra of currents from a section along 35°N (WHP P17; heavy lines),
and from 16-20°N (MW9009; finer lines). In the top panel, solid lines show the spectra of zonal velocity,
dashed lines are for meridional velocity. In the bottom panel, solid lines show the clockwise spectra,
dashed lines the counterclockwise spectra. Energy density units are m² s⁻² per cpm (cycle per meter).

The  secondary  theme  has  been  the  complexity  of  temporal  variability,  and  in 
particular  the  near-inertial  oscillations  and  internal  tides.  We  have  shown 
near-inertial oscillations of nearly 1 m s⁻¹, albeit caused by extraordinary forcing: a
typhoon.  We  have  noted  the  potential  danger  in  looking  for  an  eddy  horizontal 
wavenumber  spectrum  in  shipboard  current  measurements,  inevitably  containing 
inertial  and  internal  waves  in  addition  to  the  eddies.  The  danger  is  reduced  but  not 
eliminated  by  vertical  averaging. 

Statistical  analysis  of  upper  ocean  velocity  measurements  is  clearly  in  its  infancy. 
Even  the  simplest  sorts  of  analysis — such  as  calculation  of  the  mean  and  standard 
deviation  of  currents  along  a  single  section — have  been  done  only  in  a  very  few 
places  and  with  very  few  measurements.  To  my  knowledge  there  has  been  no 
comprehensive  attempt  to  characterize  the  spatial  distribution  of  vertical  shear 
variance. Horizontal wavenumber analysis of current measurements has been
attempted  rarely.  There  has  been  no  systematic  attempt  to  extract  statistical 
information  about  eddies  and  the  internal  wave  field  from  the  rapidly  growing 
shipboard  ADCP  data  set.  The  size  and  quality  of  this  data  set  are  rapidly 
approaching the point where extensive statistical analysis will be feasible. I expect
it will be fruitful as well, shedding light on the small and mesoscale phenomena that
until recently have been almost impossible to observe in detail.


ABSTRACT

Because of uncertainties in the marine geoid and orbit height, most applications of
altimetric data have focused on mapping the sea level variance statistic. These studies
have been very successful at defining the geographical distribution of eddy variability
and have highlighted the close relationship between transient eddies, the intensity of
the mean flow and the bathymetry. Altimeter data have also been used to estimate surface
geostrophic velocities and map the variance of geostrophic velocity (or, equivalently,
the geostrophic Reynolds stresses). These studies have demonstrated the importance of
the transport of horizontal momentum into the mean flow by transient eddies. Other obvious
applications of altimeter data include mapping the time evolution of the sea level
field for studies of wind and buoyancy forced ocean circulation and descriptive studies of
mesoscale processes such as meandering and ring formation. Such applications are limited
by a number of difficult technical challenges, mostly related to uncertainties about
what space and time scales can be resolved by the complex space-time sampling
characteristics of satellite data. A method is presented here for identifying aliasing patterns in
an arbitrary sample design and for quantifying the resolution capability of the data set.
Although the discussion emphasizes altimeter data, the method is applicable to any
irregularly sampled data set. The maximum resolution capability of the GEOSAT orbit
configuration (neglecting measurement errors and data dropouts) is found to be about 3° in
latitude and longitude by 30 days.

1.  INTRODUCTION 

The  TOPEX  altimeter  launched  in  August  1992  is  the  fifth  in  a  series  of  altimeter 
satellites  that  have  measured  the  global  sea  surface  topography  for  studies  of  ocean  circu¬ 
lation.  The  vast  majority  of  applications  of  altimeter  data  have  focused  on  the  statistics 
of mesoscale variability. As discussed in section 2, this is because the effects of uncertainties
in the orbit height and the marine geoid can be greatly mitigated if the interest is
restricted  to  sea  level  variance  statistics.  In  recent  years,  there  has  been  an  increasing 
interest  in  using  altimeter  data  to  map  the  time  evolution  of  sea  level  in  order  to  inves¬ 
tigate  the  detailed  spatial  and  temporal  structure  of  sea  level  variations  on  a  wide  range 
of scales and relate them to wind and buoyancy forcing. Although examples can be cited
from  the  literature  of  attempts  to  construct  quasi-synoptic  maps  of  mesoscale  eddy  fields 
with ~50 km spatial resolution from altimeter data, it should be obvious that the information
content of altimeter data alone is not sufficient to do this. Because of the asynoptic
sampling and relatively coarse spacing (100-300 km) of the satellite ground tracks,
there are lower limits to the space and time scales that can be resolved by the data. To
date,  the  choice  of  scales  mapped  has  been  rather  ad  hoc,  with  few  attempts  to  assess  the 
accuracy  of  the  mapped  fields. 

The  objective  of  this  study  is  to  present  a  technique  for  quantifying  the  space  and 
time scales that can be resolved by an irregularly sampled data set. Although the particular
interest here is to determine the resolution capability of altimeter data, the method
is  equally  applicable  to  any  irregularly  sampled  data  set.  The  discussion  in  this  paper  is 
limited  to  the  GEOSAT  altimeter,  which  is  the  altimeter  data  set  that  has  received  the 
most  attention  because  of  its  long  (compared  with  other  altimeter  missions)  2-year  dura¬ 
tion. 

It  must  be  conceded  at  the  outset  that,  because  of  asynoptic  sampling  and  incom¬ 
plete  spatial  coverage,  some  degree  of  smoothing  is  required  to  construct  sea  level  maps 
from  altimeter  data.  The  technique  presented  here  offers  a  method  for  deducing  the  mini¬ 
mum  smoothing  necessary  to  avoid  undesirable  error  characteristics  in  the  mapped  fields. 
For  multidimensional  data  sets  such  as  altimeter  data,  the  best  choice  of  smoothing  is 
complicated  by  the  fact  that  there  is  a  resolution  tradeoff;  high  resolution  in  one  of  the 
dimensions  can  be  achieved  by  reducing  the  resolution  in  the  other  dimensions.  For  ex¬ 
ample,  high  resolution  in  time  can  be  obtained  by  sacrificing  spatial  resolution.  Similarly, 
high  resolution  in  space  can  be  obtained  at  the  cost  of  low  temporal  resolution.  Alterna¬ 
tively,  high  spatial  resolution  in  one  dimension  can  be  achieved  at  the  cost  of  low  spatial 
resolution  in  the  other  dimension.  The  best  choice  of  the  tradeoff  between  spatial  and 
temporal  resolution  will  depend  on  the  intended  application. 

A brief summary of previous oceanographic applications of altimeter data is given in
section  2.  The  section  concludes  with  a  statement  of  the  need  to  quantify  the  resolution 
capability  of  altimeter  data  in  order  to  construct  meaningful  maps  of  the  time  evolution 
of  sea  level  fields.  A  method  for  quantifying  the  inherent  wavenumber-frequency  filtering 
characteristics  of  an  irregularly  sampled  data  set  is  given  in  section  3.  The  filter  transfer 
function  depends  on  the  particular  sampling  characteristics  of  the  data  set  and  on  the 
choice  of  smoothing  parameters  used  to  construct  the  maps.  The  utility  of  the  transfer 
function  is  demonstrated  by  application  to  1-dimensional  examples  and  to  the  GEOSAT 
data.  An  expression  for  the  errors  of  the  smoothed  fields  is  derived  in  section  4  in  terms 
of  the  transfer  function  of  the  data  set  and  the  spectral  characteristics  of  the  field.  The 
transfer  function  and  error  formalisms  are  applied  in  section  5  to  determine  the  resolution 
capability  of  a  1-dimensional  example  and  of  the  actual  GEOSAT  data. 


2.  SUMMARY  OF  PAST  ALTIMETER  STUDIES 
2.1.  The  Measurement  Technique 

The  measurement  of  sea  surface  topography  by  satellite  altimetry  is  summarized 
schematically in Figure 1. Although the altimeter measurement of range h is straightforward
in principle, it is very complex in practice, involving more than 50 computer algorithms
to correct for instrumental effects, atmospheric refraction and biases introduced
by the interaction between the electromagnetic radar pulse and the air-sea interface. It
is  remarkable  that  the  accuracy  of  the  range  estimates  after  applying  these  corrections  is 
better than one part in 10^7. The range measurements alone are not sufficient for oceanographic
applications; there are several contributions to the range measurements that are
not part of the oceanographic signal of interest. It is therefore necessary to apply several
additional external geophysical corrections to obtain the dynamic sea surface topography
hi associated with geostrophic ocean circulation. A detailed discussion of the range and
external corrections is beyond the scope of this study; the interested reader is referred to
Chelton (1988) and Chelton et al. (1989).

Figure 1. Schematic representation of altimeter measurements.

By  far  the  largest  source  of  error  in  altimeter  estimates  of  sea  surface  topography  is 
the  correction  for  the  geoid  height  hg.  The  dynamic  range  of  the  geoid  is  almost  200  m 
globally,  which  is  about  two  orders  of  magnitude  larger  than  the  global  dynamic  range  of 
the  oceanographic  sea  surface  topography.  Uncertainties  in  the  geoid  height  are  presently 
about  30  cm,  which  is  comparable  to  the  magnitude  of  the  oceanographic  signal.  The 
geoid  problem  can  be  essentially  eliminated  if  interest  is  restricted  to  studies  of  time- 
varying  sea  surface  topography.  Because  temporal  variations  in  the  earth's  gravity  field 
are  negligible  over  the  duration  of  an  altimeter  mission,  the  geoid  signal  at  each  location 
along  an  exactly  repeating  ground  track  can  be  estimated  as  the  time  average  of  sea  level 
over all repeat orbits by the so-called collinear analysis method (see, e.g., Appendix A of
Cheney  et  al.,  1983;  sections  4.2  and  4.3  of  Chelton  et  al.,  1990).  Regrettably,  this  time 
average  also  includes  the  time-invariant  contribution  to  sea  level  from  the  mean  ocean 
circulation  but  this  signal  must  be  sacrificed  in  order  to  eliminate  the  geoid  problem.  Ig¬ 
noring  the  small  errors  introduced  by  the  ±1  km  lateral  variations  of  the  repeating  orbits, 
the  geoid  and  mean  ocean  circulation  contributions  can  be  removed  and  the  time-varying 
sea  surface  topography  can  be  investigated  from  the  residual  sea  level  signal. 
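
The collinear analysis described above can be sketched in a few lines. The sketch assumes that the repeat-track heights have already been interpolated to common along-track points and stored as a two-dimensional array (repeats by points); the array layout, the function name `collinear_residuals` and the synthetic numbers are illustrative assumptions, not part of the published method.

```python
import numpy as np

def collinear_residuals(ssh):
    """Remove the time-mean profile from repeat-track sea surface heights.

    ssh: array of shape (n_repeats, n_points); NaN marks data dropouts.
    Returns (residuals, mean_profile).
    """
    # The time mean holds the geoid, the mean circulation and any other
    # time-invariant contribution; all of them are removed together.
    mean_profile = np.nanmean(ssh, axis=0)
    return ssh - mean_profile, mean_profile

# Synthetic example (all values hypothetical): a large static "geoid"
# profile plus a small time-varying signal on each repeat.
rng = np.random.default_rng(0)
n_repeats, n_points = 40, 200
geoid = 30.0 * np.sin(np.linspace(0.0, 6.0, n_points))      # metres
signal = 0.1 * rng.standard_normal((n_repeats, n_points))   # metres
ssh = geoid + signal

residuals, mean_profile = collinear_residuals(ssh)
```

By construction the residuals have zero time mean at every point, which is why the mean ocean circulation cannot be recovered from them.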

The second largest source of error in altimeter data is the correction for the satellite
orbit  height  H.  Until  recently,  uncertainties  in  the  orbit  height  have  been  about  50  cm. 
Preliminary  analyses  of  TOPEX  data  have  shown  that  advances  in  precision  orbit  deter¬ 
mination  have  reduced  the  orbit  errors  to  less  than  10  cm.  As  impressive  as  this  accu¬ 
racy  is,  there  is  still  a  need  to  estimate  and  remove  these  residual  orbit  errors  from  the 
altimeter  data  for  most  oceanographic  studies.  The  spectral  characteristics  of  orbit  er¬ 
rors  are  dominated  by  variability  at  1  cycle/rev  (Wagner,  1989).  If  the  interest  is  only  in 
mesoscale  variability  (wavelengths  shorter  than  about  1000  km),  the  very  long  wavelength 
orbit  errors  can  be  approximated  and  removed  from  the  data  by  least  squares  polynomial 
fits  over  data  arcs  of  2000-3000  km  (Zlotnicki  et  al.,  1989;  Tai,  1989;  1991).  For  studies 
of  sea  level  variability  on  larger  scales,  the  orbit  errors  are  more  appropriately  modeled  as 
sinusoids  with  a  frequency  of  1  cycle/rev  (Chelton  and  Schlax,  1993). 
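
Both orbit error models mentioned above reduce to ordinary least squares fits. The following sketch is one possible way to set them up, not the procedure of the cited studies; the function names, the 6000 s revolution period and the synthetic amplitudes are assumptions chosen only for illustration.

```python
import numpy as np

def fit_sinusoidal_orbit_error(t, h, rev_period):
    """Least squares fit of a bias plus a 1 cycle/rev sinusoid to heights h(t)."""
    omega = 2.0 * np.pi / rev_period
    design = np.column_stack([np.ones_like(t), np.cos(omega * t), np.sin(omega * t)])
    coef, *_ = np.linalg.lstsq(design, h, rcond=None)
    return design @ coef

def fit_polynomial_orbit_error(s, h, degree=2):
    """Least squares polynomial fit over a single short data arc."""
    coef = np.polynomial.polynomial.polyfit(s, h, degree)
    return np.polynomial.polynomial.polyval(s, coef)

# Synthetic example (hypothetical numbers): three revolutions of data with
# a 50 cm orbit error at 1 cycle/rev and a 5 cm short-wavelength ocean signal.
rev_period = 6000.0                                   # seconds, assumed
t = np.linspace(0.0, 3.0 * rev_period, 600)
orbit_error = 0.10 + 0.50 * np.cos(2.0 * np.pi * t / rev_period + 0.3)
ocean = 0.05 * np.sin(2.0 * np.pi * 40.0 * t / rev_period)
h = orbit_error + ocean

corrected = h - fit_sinusoidal_orbit_error(t, h, rev_period)
```

In this idealized case the ocean signal is orthogonal to the fitted functions, so it survives the correction intact. With real data, any ocean variability that projects onto the fitted functions is removed along with the orbit error.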

The  overall  accuracy  of  altimeter  estimates  of  the  time-varying  component  of  sea  sur¬ 
face  topography  after  applying  all  of  the  corrections  and  removing  the  geoid  and  orbit 
errors is probably 6-8 cm for the GEOSAT altimeter, although this is difficult to quantify.
Because of significant improvements in the atmospheric refraction corrections and
the orbit height estimates, the overall error of the TOPEX data is likely to be smaller
by about a factor of two.

While  the  estimation  of  sea  level  by  satellite  altimetry  is  much  more  technical  than 
that  by  tide  gauges,  there  are  a  number  of  problems  that  altimeter  and  tide  gauge  data 
share in common. All of the external geophysical corrections that must be applied to altimeter
data must also be applied to tide gauge data. The primary distinction between
the  two  methods  of  sea  level  estimation  is  that  most  of  the  unwanted  contributions  to  the 
sea  level  measurements  are  easier  to  remove  from  tide  gauge  data.  For  example,  nearly 
all  of  the  tidal  signal  can  be  removed  by  low-pass  filtering  the  tide  gauge  data,  which  are 
typically  sampled  at  hourly  intervals.  Because  the  altimetric  estimates  of  sea  level  at  a 
given  location  are  sampled  at  widely  spaced  intervals  of  3-35  days  (depending  on  the 
satellite  orbital  configuration),  low-pass  filtering  is  not  possible.  The  tidal  signal  must 
therefore  be  removed  from  altimeter  data  on  the  basis  of  model  estimates  of  the  various 
tidal  constituents.  The  present  global  accuracy  of  tidal  models  is  believed  to  be  5-10  cm 
rms  in  the  open  ocean  (Ray,  1993).  The  correction  for  atmospheric  pressure  loading  (the 
“inverse-barometer  effect”)  is  also  easier  for  tide  gauge  data  since  measurements  of  atmo¬ 
spheric  pressure  can  usually  be  obtained  from  a  nearby  barometer.  Here  again,  altime¬ 
ter  data  require  model  estimates  of  sea  level  pressure  since  the  altimeter  observations  are 
globally  distributed  but  atmospheric  pressure  data  are  available  only  at  discrete  locations. 
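
The reason low-pass filtering fails for altimeter data can be made concrete by computing the period to which a tidal constituent is aliased by the repeat sampling. The sketch below assumes the M2 period of 12.4206 hours and a 17.0505-day repeat interval as illustrative inputs; neither value is taken from the text above, and the function name is hypothetical.

```python
def alias_period(signal_period, sample_interval):
    """Apparent period of a sinusoid sampled at a fixed, coarser interval.

    Both arguments must be in the same time unit.
    """
    cycles_per_sample = sample_interval / signal_period
    # Only the fractional phase advance between samples is observable.
    folded = abs(cycles_per_sample - round(cycles_per_sample))
    if folded == 0.0:
        return float("inf")   # the signal appears as a constant offset
    return sample_interval / folded

M2_PERIOD_DAYS = 12.4206012 / 24.0   # principal lunar semidiurnal tide (assumed value)
REPEAT_DAYS = 17.0505                # assumed exact-repeat sampling interval

m2_alias_days = alias_period(M2_PERIOD_DAYS, REPEAT_DAYS)
```

With these inputs the semidiurnal tide appears at a period of order 300 days, well inside the band of oceanographic interest, so it must be removed with a tidal model rather than by filtering.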

The correction for geoid contributions to the sea level signal is equally difficult for
altimeter  data  and  tide  gauge  data.  There  are  few  cases  where  tide  gauges  have  been 
geodetically  levelled  to  a  common  reference.  Although  levelling  can  now  be  done  using 
astronomical  techniques,  it  is  a  costly  procedure  and  not  likely  to  be  done  in  the  near  fu¬ 
ture  for  the  global  tide  gauge  network.  As  with  altimeter  data,  the  geoid  problem  for  tide 
gauge  data  can  be  avoided  if  interest  is  restricted  to  studies  of  the  time  variability  of  sea 
level;  the  time-averaged  sea  level  can  be  removed  from  each  tide  gauge  record. 

Even  the  orbit  error  problem  of  altimetry  has  an  analog  in  tide  gauge  data.  The  level 
of  a  tide  gauge  relative  to  a  fixed  reference  can  vary  with  time.  Although  there  are  ex¬ 
amples  of  abrupt  changes  in  the  tide  gauge  datum  level  from  catastrophic  events  such  as 
earthquakes, the sudden collapse of a pier, or relocation of the gauge, most of the vertical
motion  of  the  gauge  is  associated  with  very  slow  crustal  uplift  or  subsidence.  These  sec¬ 
ular  signals  are  easily  identified  and  removed  from  tide  gauge  data  by  simple  statistical 
techniques. 

2.2.  Mean  Sea  Surface  Topography 

Determining  the  surface  geostrophic  general  circulation  of  the  ocean  from  the  mean 
dynamic  topography  of  the  sea  surface  has  long  been  an  important  objective  of  satellite 
altimetry  (Wunsch  and  Gaposchkin,  1980).  When  combined  with  hydrographic  data, 
knowledge  of  the  absolute  sea  surface  topography  obtained  from  altimetry  would  solve 
the  reference  level  problem  of  the  dynamic  method  for  computing  geostrophic  velocity 
from hydrographic data. This is one of the primary stated goals of the TOPEX mission.
It is also the most challenging goal of the mission because it places the most stringent
demands  on  the  accuracy  requirements  of  each  of  the  many  measurement  components 
needed  to  determine  the  dynamic  sea  surface  topography. 

As  noted  in  section  2.1,  the  two  largest  sources  of  error  in  altimeter  data  are  un¬ 
certainties  in  the  geoid  height  hg  and  the  orbit  height  H,  both  of  which,  until  recently, 
have  been  known  only  to  an  accuracy  of  about  50  cm.  This  is  comparable  to  the  ampli¬ 
tude  of  the  dynamic  topography  signal  of  interest.  Orbit  height  errors  have  decreased  to 
about  10  cm  for  the  TOPEX  data  that  are  beginning  to  become  available.  Geoid  errors 
have  similarly  decreased  by  constructing  a  global  geoid  from  combined  terrestrial  gravity 
measurements  and  satellite  tracking  data  using  the  method  described  by  Rapp  and  Pavlis 
(1990)  (see  Rapp  et  al.,  1991).  The  result  of  this  analysis  is  a  global  model  for  the  geoid 
height,  expressed  as  an  expansion  of  the  spherical  harmonic  functions,  with  an  estimated 
rms  error  of  about  30  cm  (Rapp,  1992).  The  geoid  accuracy  is  not  likely  to  improve  much 
beyond  this  without  a  low-altitude  dedicated  gravity-mapping  satellite  mission.  With 
present  technology,  it  is  possible  to  map  the  geoid  with  100  km  spatial  resolution  to  an 
rms  accuracy  of  about  3  cm  by  satellite.  Several  such  missions  have  been  proposed  in¬ 
ternationally  over  the  past  decade  but  none  have  yet  reached  approval  for  a  new  start. 
Until such a geoid model becomes available, studies of the general ocean circulation will
be  limited  to  the  very  large  scales  that  are  known  accurately  in  presently  available  gravity 
fields. 

The  approach  that  has  been  used  most  commonly  to  estimate  the  mean  dynamic 
topography  from  altimeter  data  first  subtracts  the  range  measurements  h  from  the  esti¬ 
mated  satellite  orbit  heights  H  (see  Figure  1)  to  obtain  an  estimate  of  the  total  sea  sur¬ 
face  height  at  each  measurement  location.  These  sea  surface  height  estimates  are  then 
adjusted  to  mitigate  the  effects  of  time-dependent  orbit  errors  by  a  least  squares  proce¬ 
dure  that  approximates  the  predominantly  1  cycle/rev  orbit  errors  as  low-order  polyno¬ 
mials  or  sinusoids  as  discussed  in  section  2.1.  The  adjusted  sea  surface  heights  are  then 
interpolated  to  a  regular  grid  along  the  satellite  ground  track  and  a  gridded  mean  sea  sur¬ 
face  is  computed  as  the  arithmetic  mean  of  all  repeat  estimates  of  the  adjusted  sea  sur¬ 
face  height  at  each  grid  location.  The  mean  sea  surface  constructed  in  this  way  includes 
the  geoid  height,  the  mean  dynamic  topography,  the  geographically  correlated  orbit  error 
(defined  here  to  be  the  time-invariant  component  of  orbit  error  that  is  the  same  for  each 
repeat  sample  of  a  given  ground  track)  and  any  time-invariant  measurement  errors. 

Because the geoid is expressed as a global spherical harmonic expansion, the method
generally  used  to  estimate  the  mean  dynamic  topography  has  been  to  expand  the  ad¬ 
justed  altimetric  mean  sea  surface  as  a  spherical  harmonic  expansion  of  the  same  low  de¬ 
gree  and  order  to  which  the  geoid  is  known  accurately.  The  geoid  height  hg  expanded  to 
this  low  degree  and  order  is  then  subtracted  from  this  low-pass  filtered  mean  sea  surface. 
The  accuracy  of  the  resulting  estimate  of  the  low-order  spherical  harmonic  expansion  of 
the  mean  dynamic  topography  depends  not  only  on  the  accuracy  of  the  geoid  estimate 
at  these  large  scales  but  also  on  the  magnitudes  of  the  geographically  correlated  orbit  er¬ 
rors  and  time-invariant  measurement  errors  that  are  included  in  the  sea  surface  height 
estimates. 

An  example  of  the  application  of  this  straightforward  approach  by  Tai  (1988)  is 
shown  in  Figure  2a  based  on  three  months  of  SEASAT  data  expanded  to  degree  and 
order  8.  For  comparison,  the  mean  sea  surface  dynamic  topography  relative  to  2250  db 
computed  by  Levitus  (1982)  from  80  years  of  historical  hydrographic  data  is  shown  to  the 
same  degree  and  order  in  Figure  2b.  It  is  immediately  apparent  from  the  hydrographic 
data  that  this  low  degree  and  order  expansion,  which  corresponds  to  wavelengths  longer 
than  about  5000  km,  provides  only  a  crude  representation  of  the  true  dynamic  topogra¬ 
phy.  Even  the  major  gyre  structures  are  only  schematically  present  at  these  long  wave¬ 
lengths.  Higher  order  terms  of  the  spherical  harmonic  expansion  are  necessary  to  resolve 
the  strong  dynamic  height  gradients  associated  with  intense  currents  such  as  the  Gulf 
Stream. 
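
The wavelength quoted above for a degree and order 8 expansion can be checked with a common rule of thumb relating spherical harmonic degree n to the shortest resolved wavelength. The rule is an approximation introduced here for illustration (with R_e the Earth's radius), not a formula from the cited studies:

```latex
\lambda_{\min} \approx \frac{2\pi R_e}{n} \approx \frac{40\,000\ \text{km}}{n},
\qquad n = 8 \;\Rightarrow\; \lambda_{\min} \approx 5000\ \text{km}.
```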


Figure  2.  Spherical  harmonic  expansions  to  degree  and  order  8  of  a)  the  mean  sea  level  computed  from  3 
months of SEASAT data with the GEM-T1 geoid removed; and b) the Levitus (1982) surface dynamic
height  field  (from  Tai,  1988.) 

It can also be seen from Figure 2 that there are large discrepancies between the altimetric
and hydrographic estimates of the mean dynamic topography. Most notable is
the region of high sea level in the altimeter data centered at about 15°S, 250°E in the
eastern  Pacific.  There  is  also  a  region  of  low  sea  level  in  the  altimeter  data  from  the  In¬ 
dian  Ocean.  Although  the  differences  in  some  regions  such  as  the  poorly  sampled  areas 
of  the  southern  hemisphere  may  be  attributable  to  errors  in  the  hydrographic  data,  it 
is more likely that the large amplitude features in the eastern Pacific and Indian Ocean
arise primarily from geoid errors and geographically correlated orbit errors. Tai (1988)
argues  that  the  accuracy  of  geoid  models  has  improved  to  a  point  where  orbit  errors  are 
now  the  dominant  source  of  error  in  altimeter  estimates  of  the  mean  dynamic  topogra¬ 
phy.  Large  differences  between  the  mean  sea  surface  height  estimates  constructed  sepa¬ 
rately  from  ascending  and  descending  ground  tracks  at  the  crossover  points  attest  to  the 
presence of large geographically correlated orbit errors in the eastern Pacific and Indian
Ocean. These orbit errors can be attributed to the poor ground-based tracking coverage
along the ground tracks that pass over these regions.

Nerem  et  al.  (1990)  and  others  have  attempted  to  reduce  the  effects  of  geoid  and 
orbit  errors  on  altimetric  estimates  of  the  dynamic  topography  (and  at  the  same  time 
improve  estimates  of  the  geoid  height)  by  simultaneously  estimating  the  mean  dynamic 
topography,  the  geoid  height  and  the  orbit  errors  using  a  least  squares  inversion  proce¬ 
dure  first  suggested  by  Wagner  (1986).  Compared  with  the  earlier  estimate  by  Tai  (1988) 
shown  in  Figure  2a,  the  joint-solution  estimate  of  mean  dynamic  topography  to  degree 
and order 10, computed from 51 days of GEOSAT data, is in closer agreement with the
low-pass  filtered  dynamic  topography  from  hydrographic  data.  The  primary  reason  for 
the  improvements  in  the  GEOSAT-based  mean  dynamic  topography  when  compared  with 
the earlier estimates from SEASAT data is likely the explicit inclusion of orbit errors in
the joint solution. Nonetheless, there are still large differences between the altimetric and
hydrographic  dynamic  topographies.  For  example,  the  Atlantic  Ocean  gyre  structure  is 
much  different  in  the  two  data  sets  and  there  are  very  large  discrepancies  in  the  Indian 
Ocean.  In  the  Pacific  Ocean,  the  gyre  centers  are  displaced  to  the  east  in  the  altimetric 
data. 

With  an  unprecedented  orbit  accuracy  of  better  than  10  cm  rms,  TOPEX  data  have 
introduced a new era in absolute sea level determination by satellite altimetry. The dramatic
improvement in the accuracy of the TOPEX orbits compared with previous altimeter
satellites is primarily attributable to more complete ground-based tracking coverage
and  improved  orbit  modeling  because  of  the  reduced  drag  and  gravitational  effects  on 
the  satellite  at  the  higher  1300  km  TOPEX  orbit  altitude  (compared  with  the  800  km 
SEASAT and GEOSAT orbit altitudes). Orbit errors are no longer the largest source
of  error  in  the  mean  dynamic  topography  constructed  from  altimeter  data;  errors  in  the 
geoid  height  are  now  the  primary  limitation.  Because  the  geoid  is  still  known  most  accu¬ 
rately  at  the  largest  scales,  accurate  altimetric  estimates  of  the  mean  dynamic  topography 
will  continue  to  be  limited  to  low  degree  and  order  terms  in  a  spherical  harmonic  expan¬ 
sion.  The  challenge  facing  oceanographers  is  to  develop  data  assimilation  techniques  that 
are  able  to  utilize  this  large-scale  information  to  constrain  ocean  models. 

2.3. Variance Statistics

The  geoid  and  geographically  correlated  orbit  errors  that  limit  the  accuracy  of  ab¬ 
solute  sea  level  determination  by  satellite  altimetry  are  of  relatively  little  concern  for 
studies  of  sea  level  variability.  As  discussed  previously,  the  time-invariant  geoid  and  ge¬ 
ographically  correlated  orbit  errors  (as  well  as  any  time-invariant  measurement  errors)  are 
included  in  the  mean  sea  level  computed  from  repeat-track  altimeter  data.  This  mean  sea 
level  is  removed  for  altimetric  studies  of  sea  level  variability.  After  removing  the  mean 
sea level or as part of the mean sea level estimation (van Gysen et al., 1992; Chelton
and  Schlax,  1993),  the  time-dependent  orbit  errors  are  estimated  and  removed  by  one  of 
the  least  squares  techniques  outlined  in  section  2.1.  For  exact  repeat  orbits,  it  is  then  a 
straightforward  procedure  to  compute  variance  statistics  from  the  residual  sea  level  es¬ 
timates;  the  sea  level  variance  is  computed  as  the  arithmetic  average  of  the  squared  sea 
level  residuals  at  each  grid  location. 
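
The variance computation itself is a one-line operation once the residuals are on a common grid. The sketch below assumes a (repeats by grid points) array with NaN for dropouts; the function name and the sample values are hypothetical.

```python
import numpy as np

def sea_level_variance(residuals):
    """Arithmetic average of the squared residuals at each grid location."""
    residuals = np.asarray(residuals, dtype=float)
    # nanmean skips dropouts, so each location is averaged over its own samples.
    return np.nanmean(residuals ** 2, axis=0)

# Three repeats at three grid points (metres); one dropout in the first repeat.
residuals = np.array([[ 0.10, -0.20, np.nan],
                      [-0.10,  0.20,  0.05],
                      [ 0.10, -0.20, -0.05]])

variance = sea_level_variance(residuals)
rms = np.sqrt(variance)   # rms variability, the quantity usually mapped
```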

Global  sea  level  variability  has  been  calculated  from  12  months  of  exact-repeat 
GEOSAT  data  by  Zlotnicki  et  al.  (1989)  (Figure  3a).  All  of  the  major  ocean  currents  are 
clearly  delineated  from  the  unique  global  perspective  afforded  by  altimeter  data.  The  re¬ 
gions  of  highest  mesoscale  sea  level  variability  are  coincident  with  the  axes  of  the  Gulf 
Stream, the Kuroshio and the Antarctic Circumpolar Current. Sea level variability is also
high  in  the  southwest  Atlantic  at  the  confluence  of  the  Brazil  and  Malvinas  Currents,  and 
in  the  East  Australia  Current.  High  mesoscale  variability  in  these  regions  is  not  unex¬ 
pected  in  view  of  the  fact  that  they  are  all  known  to  be  regions  of  hydrodynamically  un¬ 
stable  flow.  The  sea  level  variations  are  associated  with  transient  eddies  and  meanders  in 
the  flow. 

As  shown  by  Zlotnicki  et  al.  (1989),  the  amplitude  of  mesoscale  variability  deduced 
from  altimeter  data  is  sensitive  to  the  method  used  to  estimate  the  time-dependent  orbit 
errors.  The  Zlotnicki  et  al.  (1989)  map  of  rms  sea  level  variability  in  Figure  3a  was  ob¬ 
tained  using  second-order  polynomial  orbit  error  corrections  over  2500  km  data  arcs.  For 
comparison,  the  rms  sea  level  variability  derived  from  two  years  of  GEOSAT  data  based 
on  the  long-arc  (multiple  orbital  revolutions)  sinusoidal  orbit  error  corrections  of  Chel- 
ton  and  Schlax  (1993)  is  shown  in  Figure  3b.  The  patterns  of  sea  level  variability  are  the 
same  in  both  figures.  However,  the  rms  variability  is  larger  nearly  everywhere  by  a  few 
centimeters  in  the  long-arc  data.  While  some  of  this  additional  energy  is  real  ocean  vari¬ 
ability  that  has  been  removed  by  the  short-arc  polynomial  orbit  error  approximations, 
some  of  it  is  likely  attributable  to  the  larger  residual  orbit  errors  and  other  measurement 
errors  in  the  long-arc  data  discussed  by  Zlotnicki  et  al.  (1989).  More  accurate  orbit  esti¬ 
mates  and  geophysical  corrections  such  as  those  now  available  for  TOPEX  data  will  en¬ 
able  a  partitioning  of  this  variability  between  ocean  signal  and  measurement  errors. 

Although  sea  level  variance  studies  have  been  very  useful  for  mapping  the  geograph¬ 
ical  distribution  of  mesoscale  energy,  they  yield  little  insight  into  the  detailed  statisti¬ 
cal  characteristics  of  eddy  variability.  The  spatial  scales  of  mesoscale  variability  can  be 
investigated  from  the  wavenumber  distribution  of  sea  level  variance.  This  is  easily  de¬ 
termined  from  1-dimensional  wavenumber  spectra  of  altimeter  data  along  the  satellite 
ground  track.  Altimetry  is  the  only  observational  technique  that  can  provide  such  in¬ 
formation  because  of  the  difficulty  in  obtaining  synoptic  profiles  of  sea  level  from  in  situ 
measurements. 
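
A one-dimensional wavenumber spectrum of the kind described above can be estimated from a single along-track profile with a windowed periodogram. The sketch below is a generic estimator, not the one used in the studies cited in this section; the 7 km sample spacing, the Hann window and the synthetic 448 km wave are assumptions made for illustration.

```python
import numpy as np

def alongtrack_spectrum(h, dx):
    """One-sided wavenumber spectrum of an evenly spaced along-track profile.

    h: sea level residuals; dx: along-track spacing.
    Returns wavenumbers (cycles per unit of dx) and spectral density,
    with the zero-wavenumber bin dropped.
    """
    h = np.asarray(h, dtype=float)
    n = h.size
    window = np.hanning(n)                       # taper to reduce leakage
    transform = np.fft.rfft((h - h.mean()) * window)
    psd = 2.0 * dx * np.abs(transform) ** 2 / np.sum(window ** 2)
    if n % 2 == 0:
        psd[-1] /= 2.0                           # Nyquist bin is not doubled
    k = np.fft.rfftfreq(n, d=dx)
    return k[1:], psd[1:]

# Synthetic profile: a 10 cm wave with a 448 km wavelength (hypothetical).
dx = 7.0                                         # km, assumed sample spacing
x = dx * np.arange(512)
h = 0.10 * np.sin(2.0 * np.pi * x / 448.0)

k, psd = alongtrack_spectrum(h, dx)
peak_wavelength = 1.0 / k[np.argmax(psd)]
```

In practice the spectra from many profiles and repeat cycles would be averaged to reduce the statistical uncertainty of the estimate.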

Le  Traon  et  al.  (1990)  analyzed  the  2-year  GEOSAT  data  set  and  computed 
wavenumber  spectra  of  sea  level  variability  for  nineteen  areas  in  the  North  Atlantic.  The 
GEOSAT  measurement  errors  of  3-5  cm  allow  the  resolution  of  scales  as  short  as  about 
50  km.  The  spectra  for  six  regions  along  35°N  are  shown  in  Figure  4.  In  the  energetic 
western  portion  of  the  North  Atlantic,  the  sea  level  wavenumber  spectra  are  relatively  flat 
at  low  wavenumbers  with  a  broad  peak  centered  at  wavelengths  of  approximately  twice 
the baroclinic Rossby radius of deformation. These peak wavelengths decrease with increasing
latitude; peak wavelengths are about 500 km at 25°N, 400 km at 35°N, 300 km
at  45°N  and  200  km  at  55°N.  These  values  are  consistent  with  the  baroclinic  Rossby 
radii  estimated  from  historical  hydrographic  data  by  Emery  et  al.  (1984).  At  wavelengths 
shorter than the Rossby radii, the spectra drop off as approximately k^-4, compared with
the k^-5 dependence expected from quasi-geostrophic turbulence theory (Charney, 1971).
The  weaker  slopes  in  the  GEOSAT  data  are  not  understood  at  present. 

East of the Mid-Atlantic Ridge where eddy variability is much weaker, the wavenumber
dependence of the sea level spectra ranged from about k^-2 to k^-3. In contrast to the
western  region,  the  spectra  in  the  eastern  basin  generally  did  not  flatten  at  wavelengths 
shorter than the baroclinic Rossby radius of deformation. This implies that the energy
source of turbulent variability in the eastern basin is at much longer wavelengths (smaller
wavenumbers) than in the energetic western region. Le Traon et al. (1990) suggest that
this may be an indication that the eddy variability in these regions of low energy is generated
by fluctuating winds, as theorized by Frankignoul and Muller (1979) and Muller and
Frankignoul (1981). The wind forcing occurs on much larger scales (order 1000 km) than
the forcing by baroclinic instabilities, resulting in a downscale enstrophy cascade from
smaller wavenumbers.

Figure 4. Wavenumber spectra of sea level from the GEOSAT altimeter for six midlatitude regions across
the North Atlantic. (From Le Traon et al., 1990.)

The  energetics  of  eddy  variability  can  be  investigated  from  the  geographical  distri¬ 
bution  of  eddy  kinetic  energy.  As  described  by  Menard  (1983),  this  is  easily  estimated 
from  cross-track  geostrophic  velocities  derived  from  along-track  sea  level  slopes  computed 
from  altimeter  data  if  the  eddy  variability  is  assumed  to  be  isotropic.  The  seasonal  vari¬ 
ability  of  eddy  kinetic  energy  estimated  in  this  manner  has  been  investigated  globally 
(with  emphasis  on  the  Gulf  Stream,  Kuroshio  and  Antarctic  Circumpolar  Current  re¬ 
gions)  from  two  years  of  GEOSAT  data  by  Shum  et  al.  (1990).  From  3-month  average 
estimates  of  eddy  kinetic  energy,  they  find  a  clear  meridional  migration  of  the  position  of 
the Gulf Stream extension east of 60°W. The location of maximum eddy kinetic energy
shifts  northward  from  the  mean  location  by  several  degrees  of  latitude  during  the  sum¬ 
mer/autumn  and  then  southward  of  the  mean  location  by  about  the  same  distance  during 
the  winter/spring.  The  magnitudes  of  the  eddy  kinetic  energy  in  the  Gulf  Stream  region 
vary  over  the  two-year  record,  but  not  with  any  clear  seasonal  cycle.  The  maps  for  the 
Kuroshio  region  are  more  difficult  to  interpret,  perhaps  because  of  the  larger  number  of 
GEOSAT  data  dropouts  in  this  region.  Temporal  variations  of  eddy  kinetic  energy  are 
small  throughout  the  Antarctic  Circumpolar  Current  region  over  the  2-year  GEOSAT 
data  set. 
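The estimate described above can be sketched in a few lines. This is a minimal illustration rather than the procedure of any of the cited studies: it assumes geostrophic balance, v = (g/f) dh/ds with s the along-track distance, and isotropy, so that the variance of the cross-track velocity equals the eddy kinetic energy per unit mass. The function names, the array layout and the constants are assumptions made for the example.

```python
import numpy as np

G = 9.81          # gravitational acceleration (m/s^2)
OMEGA = 7.292e-5  # rotation rate of the Earth (rad/s)


def cross_track_velocity(sla, ds, lat_deg):
    """Cross-track geostrophic velocity anomaly (m/s) from along-track sea level.

    sla     : array (n_repeats, n_points) of sea level anomaly in metres
    ds      : along-track spacing of the points in metres
    lat_deg : latitude in degrees (geostrophy fails close to the equator)
    """
    f = 2.0 * OMEGA * np.sin(np.deg2rad(lat_deg))  # Coriolis parameter
    slope = np.gradient(sla, ds, axis=-1)          # along-track slope dh/ds
    return (G / f) * slope


def eddy_kinetic_energy(v_cross):
    """EKE per unit mass (m^2/s^2) under the isotropy assumption.

    With <u'^2> = <v'^2> = <v_c'^2>, EKE = (<u'^2> + <v'^2>) / 2 = <v_c'^2>.
    """
    anomaly = v_cross - v_cross.mean(axis=0)  # remove the time mean at each point
    return (anomaly ** 2).mean(axis=0)
```

The sign of the cross-track velocity depends on the direction of travel along the track, but it does not affect the variance.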

An important limitation of altimetric studies of eddy kinetic energy from along-track
sea level slopes as summarized above is the need to assume isotropic variability. Drifter
data support this assumption in regions of low to moderate eddy energy. However, the
eddy variability in energetic regions such as western boundary currents and the Antarctic
Circumpolar Current is distinctly anisotropic (e.g., Richardson, 1983; Daniault and
Menard, 1985; Johnson, 1989). Morrow et al. (1992) developed a technique for determining
the vector surface geostrophic velocity at the intersections of ascending and descending
ground tracks. The method involves calculating cross-track velocity components along
each of the ground tracks at the crossover locations. The two non-orthogonal components
are then converted to orthogonal (north and east) geostrophic velocity components by a
simple geometrical transformation first suggested by Parke et al. (1987). The resulting
time series of north and east velocity components can then be used to calculate the variances
and covariance of the two velocity components, from which velocity variance ellipses
that define the principal axes of variability can be derived. A variance ellipse with large
eccentricity represents highly anisotropic variability with most of the velocity fluctuations
aligned parallel to the major axis of the ellipse. Correspondingly, a circular variance ellipse
represents isotropic variability with no preferred direction of the velocity fluctuations.
The dense distribution of altimeter crossover locations provides a much higher spatial
resolution of eddy variability than can practically be obtained from drifter data.
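A sketch of the two steps just described, under stated assumptions: each track heading is given as an angle measured counterclockwise from east, and the measured cross-track component is the projection of the velocity onto the unit normal 90° to the left of that heading. These conventions and the function names are choices made for this example, not those of Morrow et al. (1992) or Parke et al. (1987).

```python
import numpy as np


def crossover_velocity(c_asc, c_desc, heading_asc_deg, heading_desc_deg):
    """Convert two cross-track velocity series to east (u) and north (v) components."""
    headings = np.deg2rad([heading_asc_deg, heading_desc_deg])
    # Row i holds the (east, north) components of the cross-track unit normal of track i.
    normals = np.column_stack([-np.sin(headings), np.cos(headings)])
    # Solve normals @ [u, v] = [c_asc, c_desc] for every time sample at once.
    u, v = np.linalg.solve(normals, np.vstack([c_asc, c_desc]))
    return u, v


def variance_ellipse(u, v):
    """Major and minor axis variances and major-axis orientation (degrees from east)."""
    cov = np.cov(np.vstack([u, v]))         # 2 x 2 covariance matrix of (u, v)
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    minor, major = eigvals
    ex, ey = eigvecs[:, 1]                  # eigenvector of the largest eigenvalue
    angle = np.degrees(np.arctan2(ey, ex)) % 180.0
    return major, minor, angle
```

A ratio of major to minor variance near one corresponds to the circular (isotropic) case, and a large ratio to strongly anisotropic variability.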

Application of the technique to two years of GEOSAT data in the Southern Ocean
reveals energetic, anisotropic surface geostrophic velocity variability in the vicinity of all
of the major currents (Figure 5). The orientations of the velocity variance ellipses relative
to the axis of the mean flow have important implications for eddy transport of horizontal
momentum; where the ellipse axes are aligned perpendicular and parallel to the
mean flow, there is no cross-stream transfer of along-stream momentum. Eddy momentum
fluxes in the Southern Ocean have been quantified by Morrow et al. (1992; 1993) by
estimating the gradients of the Reynolds stresses from the variances and covariances of
surface geostrophic velocity components. They found a convergence of along-stream momentum
in streamwise integrated Reynolds stresses along the mean axis of the Agulhas
Return Current. This is an indication that eddy variability in this region tends to accelerate
the mean flow, consistent with recent models of the Antarctic Circumpolar Current
(Treguier and McWilliams, 1990; Wolff et al., 1991). The GEOSAT data reveal a surprisingly
complex geographical distribution of this Reynolds stress convergence.

Figure 5. Surface geostrophic velocity variance ellipses from 2 years of GEOSAT data at the
crossover locations of ascending and descending ground tracks for a) the Agulhas region; b) the
southwest Atlantic; and c) the east Australia/New Zealand region. The scales of the current
ellipses in cm²/s² are shown at the lower right corner of each plot. (From Morrow et al., 1993.)

The broad range of applications summarized in this section illustrates the significant
contributions that altimetric studies of variance statistics for sea level, eddy kinetic energy
and surface geostrophic velocity have made toward understanding the dynamics of
mesoscale eddy variability. This information cannot be obtained by in situ observational
techniques on the scales resolvable by altimeter data. To date, because of the short duration
of the SEASAT data set, GEOSAT data have been most useful for these studies.

It is an unfortunate fact that the non-scientific primary objective of the mission
(high-resolution mapping of the marine geoid for defense purposes) resulted in a number of inherent
weaknesses in the GEOSAT mission design. Most importantly, there was no onboard microwave
radiometer for the wet tropospheric correction and no active attitude control system
(resulting in frequent data dropouts); estimates of the ionospheric range correction were
inaccurate during the high solar activity that coincided with the period of the GEOSAT
mission; and the geographical distribution of unclassified ground-based tracking stations
for orbit determination was very sparse. Despite these shortcomings, GEOSAT data have
provided very valuable experience with altimeter data, while at the same time yielding
important new information about ocean variability. It must be kept in mind, however,
that all of the results obtained to date are compromised to an unknown degree by measurement
errors with a wide range of space and time scales (see, for example, Jourdan et
al., 1990, and Figure 9 of Le Traon et al., 1990). Much improved estimates of mesoscale
variability will be possible from the more accurate TOPEX data that are now becoming
available.

2.4.  Mapped  Fields  of  Sea  Level  Variability

The examples in section 2.3 demonstrate that it is relatively straightforward to compute
variance statistics from altimeter data. For many applications, the statistics of the
variability are not sufficient. For example, it is of interest to map the spatial and temporal
evolution of the sea level field for studies of the dynamics of wind and buoyancy
forced ocean circulation. This mapping poses a much more difficult problem than calculating
variance statistics. As shown in Figure 6a, the GEOSAT ground tracks map out
a diamond-shaped grid on the sea surface. The dimensions of the diamonds at middle
latitudes are about 1.5° of longitude by 3° of latitude for the GEOSAT 17-day repeat
orbit. These dimensions increase for shorter orbit repeat periods; the dimensions of the
diamonds for the TOPEX 10-day repeat orbit, for example, are about 2.7° of longitude
by 5.5° of latitude at middle latitudes. Clearly, the spatial structure of mesoscale variability
cannot be resolved on all scales by altimeter sampling grids. Features with spatial
dimensions shorter than roughly a few hundred kilometers are aliased by the ground track
pattern.

The aliasing problem is made even more complicated by the asynoptic sampling of
the altimeter ground track pattern. As shown in Figure 6b, there is a 3-day subcycle in
the GEOSAT sample grid; the ground track mapped out in a 3-day period consists of
a coarse resolution diamond-shaped grid with dimensions of approximately 10° of longitude
by 20° of latitude at middle latitudes. In each successive 3-day period, the same
diamond-shaped pattern is mapped out, but shifted eastward by about 1.5° of longitude
each period. The complete GEOSAT ground track pattern in Figure 6a is thus filled in
over the 17-day repeat period. This systematic space-time coupling of the sampling characteristics
introduces the possibility of the aliasing of propagating sea level features into
the mean field as discussed in section 3.4.

A 3-day subcycle is a common characteristic of all exact-repeat altimeter orbit configurations.
However, the direction and distance of the 3-day shifts of the coarse resolution
grid depend on the details of the orbit configuration. For example, the 3-day subcycle
of the TOPEX orbit also shifts eastward, but by about 2.7° of longitude because
of the shorter 10-day repeat. In contrast, the 3-day subcycle of the ERS-1 35-day repeat
orbit shifts westward by about 1.5° of longitude.

The effects of variability not resolved by the irregular sampling pattern can be mitigated
by some degree of spatial and temporal smoothing. As described in section 2.3,
removal of the geoid height and orbit errors is much easier for exact-repeat data than for
a nonrepeating orbit configuration such as the first two months of the SEASAT mission
and the GEOSAT 18-month Geodetic Mission. Mapping fields of sea level variability is
therefore greatly simplified for exact-repeat altimeter data by using the collinear analysis
method described in section 2.1. In addition, the availability of two years of exact 17-day
repeat GEOSAT data (with a third year of partial coverage) has provided a long enough
record of altimeter data to begin to investigate temporal variability of sea level with some
(albeit still rather limited) statistical reliability. As a consequence of these two factors,
there has been a great proliferation of altimetric studies of large-scale sea level variability.

Numerous studies have documented Kelvin and Rossby wave propagation in the
tropical Pacific from collinear analyses of GEOSAT exact-repeat data. As these waves are
characterized by much longer zonal than meridional scales, these studies have generally
smoothed the data to a resolution of 8-10° of longitude by 1-3° of latitude by one month.
An example from Delcroix et al. (1991) is shown in Figure 7. Data from the first year
of the GEOSAT exact-repeat mission (November 1986-November 1987) were smoothed
300 km along track and then gridded and smoothed into approximate 10° × 2° × 1 month
averages. An eastward propagating downwelling equatorial Kelvin wave characterized
by a 15 cm positive sea level anomaly was observed beginning in December 1986, coincident
with a strong westerly wind anomaly west of the dateline. Subsequently, an upwelling
equatorial Kelvin wave with a 10 cm negative sea level anomaly was generated in
January-February 1987, coincident with an easterly wind stress anomaly. After arrival of
this second Kelvin wave at the eastern boundary of the tropical Pacific in March 1987, a
westward propagating baroclinic Rossby wave is evident as equatorially symmetric 12 cm
negative sea level anomalies centered at 4°N and 4°S. The surprising result that the earlier
downwelling Kelvin wave did not reflect as a Rossby wave is explained by the authors
from a model simulation driven by observed winds. They show that the local response to
wind forcing in the eastern part of the basin tends to weaken the reflected downwelling
Rossby wave, but enhances the reflected upwelling Rossby wave. Owing to the short 1-year
record length, the authors are not able to determine whether the observed Kelvin
and Rossby waves are associated with the 1986-1987 El Niño or are part of the normal
seasonal cycle.

Outside of the tropics, the scales of sea level variability are dominated by eddy dynamics,
rather than the wave-like motions in the equatorial waveguide (Robinson, 1983).
The appropriate spatial smoothing is thus less well defined than in the tropics. A wide
variety of smoothing scales have been adopted in the literature, all of which are rather ad
hoc. In some regions, the spatial scales of the eddies are large enough to be resolved by
altimeter data. For example, the average diameter of eddies formed by pinching off of the
Agulhas Retroflection is more than 300 km (Lutjeharms and Ballegooyen, 1988). Gordon
and Haxby (1990) have tracked seven Agulhas eddies from one year of GEOSAT data.
After detachment, these eddies drift northwestward into the South Atlantic at 5-8 cm/s
(Figure 8). From the distribution of these large eddies, they estimate that about five eddies
per year are shed from the retroflection and drift into the Atlantic. These eddies are
important to the mass balance of the world oceans; Gordon and Haxby (1990) estimate
that they carry as much as 10-15 × 10^6 m^3/s of Indian Ocean water into the Atlantic. In
addition, Agulhas eddies support a large heat flux from the ocean to the atmosphere as
the high sea surface temperatures of these features quickly cool by evaporation.

Figure 7. GEOSAT sea level anomalies (deviations from the one-year average) as a function of
time and longitude along 4°N, the equator and 4°S (top to bottom). Contour intervals are 2 cm,
and the 0 and 2 cm contours have been omitted to highlight the eastward and westward
propagation. (From Delcroix et al., 1991.)

More generally, the spatial scales of mesoscale eddies are of order 100 km, which is
too small to be resolved by altimeter sampling grids. An eddy that is detected as a localized
bump in several successive profiles of sea level along a repeating ground track eventually
drifts away from the ground track and disappears into a diamond-shaped region
bounded by ascending and descending ground tracks. At some later time, the eddy is
likely to reappear under a neighboring ground track. Cheney and Marsh (1981) present
an example from exact-repeat SEASAT data illustrating the disappearance of an eddy
over a 3-week period.

Figure 8. The trajectories of seven eddies in the South Atlantic as determined from one year of
GEOSAT data. Eddy positions and approximate sizes are shown as the open symbols at
approximately 1-month intervals. Solid dots represent the expected position when an eddy is not
clearly evident in the GEOSAT data because of data dropouts or the location of the eddy relative
to the ground track pattern. (From Gordon and Haxby, 1990.)


To reduce the geophysical noise introduced by the appearance and disappearance
of unresolved eddies, sea level maps constructed from altimeter data must be smoothed
over large enough scales to eliminate most of the mesoscale variability (a minimum of a
few degrees of latitude and longitude by perhaps a month). Fields constructed in such a
manner have been analyzed by time-longitude plots, correlation analysis and
frequency-wavenumber spectral analysis to investigate westward propagation along selected latitude
lines. Numerous such studies have found surprisingly clear evidence for westward propagation
at approximately the annual cycle with phase speeds that lie very close to the dispersion
curve for baroclinic Rossby waves (e.g., White et al., 1990; Matthews et al., 1992;
Perigaud and Delecluse, 1992; Pares-Sierra et al., 1993; Tokmakian and Challenor, 1993).
However, Jacobs et al. (1992) and Schlax and Chelton (1993) have cautioned that aliasing
of the M2 tidal period in the GEOSAT exact 17-day repeat data is manifested as westward
propagating variability at near-annual period with a phase speed that very closely
matches that of the first-mode baroclinic Rossby wave. Any errors in the model M2 tidal
constituent used to correct GEOSAT sea level data are therefore indistinguishable from
Rossby waves. Presently available tide models are believed to be accurate generally to
4-5 cm, but are known to be uncertain by 10 cm or more over large areas of the ocean
(Wagner, 1991; Ray, 1993). Consequently, all studies of Rossby wave propagation from
GEOSAT data are compromised to an unknown degree by aliasing of M2 tidal errors.
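The near-annual alias period follows from simple frequency folding at a fixed location, which is revisited once per repeat period. The sketch below assumes an M2 period of 12.4206012 hours and a GEOSAT exact-repeat period of 17.0505 days; both are standard values that are not quoted in the text above, and the function name is hypothetical.

```python
def alias_period_days(signal_period_days, sample_interval_days):
    """Period (days) at which a periodic signal appears when sampled at a fixed interval."""
    cycles_per_sample = sample_interval_days / signal_period_days
    # Only the distance to the nearest whole number of cycles survives the sampling.
    folded = abs(cycles_per_sample - round(cycles_per_sample))
    if folded == 0.0:
        return float("inf")  # the signal is aliased into the mean
    return sample_interval_days / folded


M2_PERIOD_DAYS = 12.4206012 / 24.0  # assumed M2 period
GEOSAT_REPEAT_DAYS = 17.0505        # assumed exact-repeat period

m2_alias = alias_period_days(M2_PERIOD_DAYS, GEOSAT_REPEAT_DAYS)
print(f"M2 alias period for the 17-day repeat: {m2_alias:.0f} days")
```

With these inputs the alias falls between 310 and 325 days, which is consistent with the near-annual period described above. The zonal wavelength of the alias depends in addition on the phase difference of the tide between neighboring ground tracks, which this sketch does not model.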

The  GEOSAT  orbit  configuration  was  thus  a  particularly  poor  choice  for  investigating 
Rossby  wave  dynamics.  The  TOPEX  orbit  has  been  carefully  selected  to  avoid  aliasing  of 
this  nature  for  any  of  the  major  tidal  constituents. 

The tidal aliasing problem can be reduced by smoothing the GEOSAT data zonally
over length scales longer than the wavelength of the M2 tidal alias (approximately 8° of
longitude; see Jacobs et al., 1992). Chelton et al. (1990) examined large-scale sea level
variability in the Southern Ocean from two years of GEOSAT data smoothed to a resolution
of approximately 12° of longitude by 6° of latitude by 9 days. The variability
was dominated by the seasonal cycle, with a zonally coherent annual component and a
semiannual component with amplitude and phase that varied over the three major basins
of the Southern Ocean. The variability in the South Atlantic has been investigated by
Matano et al. (1993) from sea level fields constructed from GEOSAT data with somewhat
higher spatial resolution (6° × 3° × 1 month). The GEOSAT data show that the confluence
of the Brazil and Malvinas Currents migrates seasonally by 2-3° of latitude from a most
northerly location in austral winter to a most southerly location during austral summer
(Figure 9). Numerical simulations of the wind-forced subtropical gyre of the South Atlantic
and GEOSAT estimates of surface geostrophic velocity both indicate that the phase
of the seasonal changes in the latitude of the confluence coincides with opposing seasonal
variations in the alongshore transports of the Brazil and Malvinas Currents.

From the applications summarized in this section, the potential for altimeter data to
contribute information unobtainable by any other means about the temporal evolution of
sea level fields has been clearly demonstrated. Despite problems with measurement errors
(particularly orbit errors and the wet tropospheric range correction) and tidal aliasing,
GEOSAT data have provided new insight about equatorial wave dynamics, eddy propagation
and large-scale sea level variability. An unsettling question that has arisen from most
direct comparisons with in situ measurements from tide gauges and hydrographic data is
why the amplitudes of variability inferred from GEOSAT data are generally somewhat
small (e.g., Menard, 1988; Cheney et al., 1989; Tai et al., 1989; Chelton et al., 1990;
Arnault et al., 1990; 1992). This has variously been attributed to signal attenuation by the
orbit error corrections applied or to excessive smoothing of the data. In order to assess
the impact of the latter, some guidance is needed to determine the space and time scales
that can be resolved by altimeter data. The objective of this study is to determine the
minimum smoothing necessary so that the highest possible resolution is retained in the
sea level fields constructed from altimeter data.


Figure 9. A time history of the latitude of the confluence of the Brazil and Malvinas Currents near the
continental slope of South America as determined from two years of GEOSAT data (thin line). The
smooth, heavy line represents a least-squares fit of annual and semiannual harmonics to the raw data.
(From Matano et al., 1993.)
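A least-squares fit of annual and semiannual harmonics of the kind mentioned in the caption can be sketched as an ordinary linear regression on a mean plus two pairs of cosine and sine terms. The function name, the 365.25-day year and the synthetic data are assumptions made for the illustration; they are not taken from Matano et al. (1993).

```python
import numpy as np


def fit_seasonal_harmonics(t_days, y, year_days=365.25):
    """Fit mean + annual + semiannual harmonics; return coefficients and fitted values."""
    w = 2.0 * np.pi / year_days
    design = np.column_stack([
        np.ones_like(t_days),                              # mean
        np.cos(w * t_days), np.sin(w * t_days),            # annual harmonic
        np.cos(2.0 * w * t_days), np.sin(2.0 * w * t_days) # semiannual harmonic
    ])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef, design @ coef


# Synthetic two-year record sampled every 17 days (illustrative values only).
t = np.arange(0.0, 730.0, 17.0)
w = 2.0 * np.pi / 365.25
latitude = -38.0 + 1.2 * np.cos(w * t) + 0.4 * np.sin(2.0 * w * t)
coef, fitted = fit_seasonal_harmonics(t, latitude)
```

The amplitude of each harmonic is the square root of the sum of squares of its cosine and sine coefficients.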

3.  EQUIVALENT  TRANSFER  FUNCTION 
3.1.  Formalism 

The question of the resolution capability of an irregularly sampled data set is investigated
here by considering a simple approach to smoothing the data based on a linear
estimate constructed from the N “nearest” (in space or time) observations. To simplify
the notation, the formalism is developed for a 1-dimensional case; extension to higher dimensions
is straightforward. The $j$th observation of a stationary stochastic process $h(t)$
will be written as

$$g_j = h(t_j) + \epsilon_j, \qquad j = 1, \ldots, N, \qquad (1)$$

where $t$ is time and $\epsilon_j$ is the measurement error or unresolved geophysical “noise” in the
$j$th observation. The general form for a linear estimate of $h$ at an arbitrary time $t_0$ constructed
from these N observations is

$$\hat{h}(t_0) = \sum_{j=1}^{N} \alpha_j g_j. \qquad (2)$$

Note that the $\alpha_j$ in general depend on the estimation time $t_0$. In the statistical literature,
Eq. (2) is referred to as a smoother and the smoother weights $\alpha_j$ are referred to as the
equivalent kernel. These weights can be specified by many methods (Buja et al., 1989).
Examples include moving averages, Gaussian weighted averages, local least squares fits
to a polynomial, local weighted least squares fits to a polynomial (“loess smoothers”),
natural or smoothing spline estimates and Gauss-Markov estimates.

The form of the linear estimate that is often preferred is the Gauss-Markov estimate
in which the equivalent kernel is computed from a priori specified signal and noise covariance
functions (see Appendix B). Gauss-Markov estimation is generally referred to
as “objective analysis” in the oceanographic and meteorological literature (e.g., Gandin,
1965; Bretherton et al., 1976). Examples of objective analysis applied to altimeter data
include De Mey and Robinson (1987) and Fu and Zlotnicki (1989). If the covariance function
is the true covariance function for the process $h(t)$, the Gauss-Markov estimate is
optimal in the sense that it has the lowest mean squared error of all linear estimates of
the form Eq. (2). In practice, the optimal estimate generally differs little from other linear
estimates. The primary advantages of the optimal estimate are that the formalism
easily allows an explicit treatment of measurement errors and provides an expression for
the expected error of the estimate.
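For illustration, the standard objective analysis weights can be computed by solving one linear system. This is a sketch under assumptions not taken from the text: a Gaussian signal covariance with a single decorrelation scale, uncorrelated noise of constant variance, and a zero-mean signal. The formulation of Appendix B may differ, and the function name is hypothetical.

```python
import numpy as np


def gauss_markov_weights(t_obs, t0, signal_var, noise_var, scale):
    """Weights and expected error variance of the Gauss-Markov estimate at t0."""
    def signal_cov(lag):
        # Assumed Gaussian signal covariance as a function of time lag.
        return signal_var * np.exp(-0.5 * (lag / scale) ** 2)

    # Covariance among the observations: signal part plus uncorrelated noise.
    data_cov = signal_cov(t_obs[:, None] - t_obs[None, :]) + noise_var * np.eye(t_obs.size)
    # Covariance between the signal at t0 and each observation.
    cross_cov = signal_cov(t_obs - t0)
    weights = np.linalg.solve(data_cov, cross_cov)
    error_var = signal_var - cross_cov @ weights
    return weights, error_var


t_obs = np.array([0.0, 2.5, 6.0, 9.5, 13.0])
weights, error_var = gauss_markov_weights(t_obs, t0=5.0, signal_var=1.0,
                                          noise_var=0.1, scale=3.0)
```

The linear solve involves an N × N matrix for every estimation point, which is the source of the computational cost when N is large.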

The disadvantage of Gauss-Markov estimates is that they are computationally intensive
when N is large. For this reason, we have found it more useful for applications to
large altimeter data sets (see, for example, Chelton et al., 1990; Matano et al., 1993) to
apply the quadratic loess smoother described in Appendix A. The computational requirements
of the loess smoother are much lower than those of Gauss-Markov estimates. It is
shown below that the filtering characteristics of the quadratic loess smoother are nearly
as good as those of Gauss-Markov estimates (see Figures 10, 19 and 20).

When the observations are evenly spaced and the estimation times $t_0$ coincide with
the observation times, the filtering properties of the smoother are the same at each $t_0$,
except near the ends of the sample record, where edge effects become important. These
end regions are usually discarded so that the frequency content is uniform throughout
the smoothed time series. In this case, the filtering properties of the smoother are determined
by expressing the linear estimate in the form of a convolution of the observations
and then applying the convolution theorem to obtain the frequency transfer function of
the smoother.

When the observations are irregularly spaced, the filtering properties can vary considerably
with $t_0$ (see Figure 7 of Schlax and Chelton, 1992), and the convolution theorem
is not easily applied. As shown in section 5.1, it is desirable to choose the smoothing parameter
of the linear estimate so that the filtering characteristics are nearly the same at
all $t_0$. Otherwise, the frequency content of the smoothed time series can be highly nonstationary
(see Figure 14 below).

The filtering properties of the linear estimate are easy to quantify if the linear estimator
Eq. (2) is expressed as an integral over $t$,

$$\hat{h}(t_0) = \int_{-\infty}^{\infty} p(t; t_0)\, g(t)\, dt, \qquad (3)$$

where

$$p(t; t_0) = \sum_{j=1}^{N} \alpha_j(t_0)\, \delta(t - t_j) \qquad (4)$$

is another way of expressing the equivalent kernel in terms of the Dirac delta function.

The integral expression Eq. (3) can be expressed in the frequency domain using the Power
Theorem (Bracewell, 1978) as

$$\hat{h}(t_0) = \int_{-\infty}^{\infty} P^*(f; t_0)\, G(f)\, df
= \int_{-\infty}^{\infty} P^*(f; t_0)\, H(f)\, df + \int_{-\infty}^{\infty} P^*(f; t_0)\, N(f)\, df, \qquad (5)$$

where $f$ is frequency, $G(f)$ is the Fourier transform of the measurements $g(t)$, $H(f)$ is the
Fourier transform of the signal $h(t)$, $N(f)$ is the Fourier transform of the measurement
errors $\epsilon(t)$ and $P(f; t_0)$ is the Fourier transform of $p(t; t_0)$, which reduces to

$$P(f; t_0) = \sum_{j=1}^{N} \alpha_j(t_0)\, e^{-i 2\pi f t_j}. \qquad (6)$$

$P$ is referred to as the equivalent transfer function (Schlax and Chelton, 1992), since it is
closely related to the equivalent kernel weights $\alpha_j$.
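Because the kernel is a sum of delta functions, its Fourier transform is a finite sum of complex exponentials that can be evaluated directly for any set of weights and observation times. The sketch below assumes the $e^{-i 2\pi f t}$ transform convention of Bracewell (1978); the function name is hypothetical, and this direct summation is not the fast Fourier transform method of Schlax and Chelton (1992).

```python
import numpy as np


def equivalent_transfer_function(alpha, t_obs, freqs):
    """Evaluate P(f; t0) = sum_j alpha_j * exp(-i 2 pi f t_j) at each frequency."""
    phase = -2j * np.pi * np.outer(freqs, t_obs)  # shape (n_freqs, n_obs)
    return np.exp(phase) @ alpha


# Example: a 5-point moving average on unit-spaced observations centered on t0 = 0.
t_obs = np.arange(-2.0, 3.0)
alpha = np.full(5, 0.2)
freqs = np.array([0.0, 0.2, 0.5])
P = equivalent_transfer_function(alpha, t_obs, freqs)
```

For this moving average the result is real, equals one at zero frequency and vanishes at a frequency of 0.2 cycles per sample interval. The cost of the direct sum grows as the number of frequencies times the number of observations.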

In three dimensions, the equivalent transfer function for an estimate of the field at
location $(x_0, y_0, t_0)$ is


3=1 


where $k$ and $l$ are the zonal and meridional wavenumbers. The sign convention adopted
in Eq. (7) defines the direction of propagating features that are aliased into the smoothed
estimate. For example, positive $k$ and $f$ correspond to eastward propagation (see section
3.4).

The filtering characteristics of the smoother are clear from Eq. (5); the equivalent
transfer function specifies how the frequency content of the measurements $g_j$ (both the
signal and noise components) is filtered by the linear estimate.

Determination of the equivalent transfer function $P$ from the smoother weights $\alpha_j$
can be computationally intensive. This is especially true for large, multi-dimensional data
sets. A method for computing $P$ efficiently and with high wavenumber-frequency resolution
by a fast Fourier transform technique is presented in the appendix of Schlax and
Chelton (1992).

3.2.  A  1-Dimensional  Example 

In one dimension, the quadratic loess smoother used here to investigate the resolution
capability of irregularly sampled data sets is obtained by a weighted least squares fit
of a quadratic function of $t$ to observations within a distance $d_t$ (referred to as the half
span of the smoother) of the estimation time $t_0$. A detailed description is given in
Appendix A.
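The equivalent kernel of such a local quadratic fit can be written down explicitly, since the estimate at $t_0$ is the constant term of the fitted quadratic. The sketch below assumes the tricube weighting that is conventional for loess smoothers; Appendix A is not reproduced here, so the weighting function and the function name should be read as assumptions.

```python
import numpy as np


def quadratic_loess_weights(t_obs, t0, half_span):
    """Equivalent kernel weights alpha_j of a quadratic loess estimate at t0."""
    lag = t_obs - t0
    q = np.abs(lag) / half_span
    # Assumed tricube weights, zero at and beyond the half span.
    w = np.where(q < 1.0, (1.0 - q ** 3) ** 3, 0.0)
    # Design matrix for a quadratic in (t - t0); the estimate at t0 is the constant term.
    X = np.column_stack([np.ones_like(lag), lag, lag ** 2])
    XtW = X.T * w
    # Rows of this matrix map the observations onto the three fitted coefficients.
    coef_map = np.linalg.solve(XtW @ X, XtW)
    return coef_map[0]


t_obs = np.arange(-100.0, 101.0)
alpha = quadratic_loess_weights(t_obs, t0=0.0, half_span=30.0)
```

The weights sum to one and reproduce any quadratic exactly at $t_0$. At least three observations with nonzero weight are needed inside the span for the fit to be defined.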

The equivalent transfer functions of the quadratic loess smoother for two different
half spans are shown in Figure 10a for evenly spaced observations. The main feature of
each transfer function is a low-pass band with near unit amplitude and a sharp cutoff at
a frequency $f_c$ to near-zero values at higher frequencies. This pass band defines
the smoothing characteristics of the linear estimate; the frequency content of the observations
$g_j$ is rejected at frequencies where the transfer function has a magnitude of zero and
is fully included where the transfer function has a magnitude of one. The cutoff frequency
$f_c$ can be decreased by increasing the span of the quadratic loess smoother, resulting in a
smoother time series of estimates (see Figure 10a).


Figure 10. The 1-dimensional equivalent transfer function moduli of the quadratic loess
smoother for a) an evenly spaced sample design with sample interval Δ = 1 and smoothing
parameters $d_t = 30$ (solid line) and 60 (dashed line); and b) an irregularly spaced sample design
with $d_t = 30$. In both panels, the estimation point is at the midpoint of the data record.
Note that the frequency axis is logarithmic.


The series of peaks with successively narrower width in the logarithmic plot of the
equivalent transfer function are aliases of the low-pass band, folded about the Nyquist
frequency $f_N = (2\Delta)^{-1}$, where $\Delta = 1$ is the sample interval. The aliasing peaks are centered
at even multiples of $f_N$. In a linear plot, the widths of each of these alias peaks are
the same as the $2 f_c$ width of the central low-pass band that is symmetric about zero frequency.
If there is any energy in the signal or noise at these higher frequencies, it will be
aliased into the low-pass band and indistinguishable from actual low frequency variability.
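The location and width of the alias peaks can be checked numerically: for evenly spaced observations the transfer function of any linear smoother repeats with period $1/\Delta = 2 f_N$ in frequency. The sketch below uses Gaussian weights purely as a convenient stand-in for the loess weights; the weights, the record length and the function name are assumptions made for the example.

```python
import numpy as np


def transfer_function(alpha, t_obs, freqs):
    """P(f) = sum_j alpha_j * exp(-i 2 pi f t_j) for smoother weights alpha_j."""
    return np.exp(-2j * np.pi * np.outer(freqs, t_obs)) @ alpha


delta = 1.0                          # sample interval
t_obs = np.arange(-30, 31) * delta   # evenly spaced observation times
alpha = np.exp(-0.5 * (t_obs / 8.0) ** 2)
alpha /= alpha.sum()                 # normalize so that P(0) = 1

f_nyquist = 1.0 / (2.0 * delta)
f_low = np.linspace(0.0, 0.1, 11)    # frequencies inside the low-pass band

P_low = np.abs(transfer_function(alpha, t_obs, f_low))
# The same band reappears on either side of the first even multiple of f_N.
P_above = np.abs(transfer_function(alpha, t_obs, 2.0 * f_nyquist + f_low))
P_below = np.abs(transfer_function(alpha, t_obs, 2.0 * f_nyquist - f_low))
```

The three arrays agree to rounding error, which is the folding described above. For irregular sampling the exact periodicity is lost and the high-frequency response becomes an irregular continuum.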

For an ideal filter, the equivalent transfer function would drop abruptly from a magnitude
of one to a magnitude of zero at the cutoff frequency $f_c$ and would remain zero at
all higher frequencies. The more gradual low-pass band edge rolloff and the alias peaks of
real smoothers represent imperfections of the real filtering operation.

The equivalent transfer function for an example of irregularly spaced observations is
shown in Figure 10b. The low-pass band of interest is very similar to that for the evenly
spaced sample design with the same $d_t$ shown in Figure 10a. The primary difference is
the noisy continuum of energy in the transfer function for the uneven design at frequencies
higher than $f_c$. The details of these high-frequency characteristics of the equivalent
transfer function depend on the particular sample design and on the estimation time $t_0$.
Just like aliasing for the case of evenly spaced observations, any energy in the signal or
noise at these frequencies higher than $f_c$ will contaminate the lower frequencies that are
of interest in the smoothed estimates. The greater the amplitude of the equivalent transfer
function at the higher frequencies, the less efficiently the smoothed estimates will isolate
the low-frequency content of the signal of interest. Although aliasing loses its classical
meaning when the observations are irregularly spaced, this high-frequency contamination
in the equivalent transfer function will be referred to here as aliasing, for lack of a
better term.

While the band-edge rolloff of the quadratic loess smoother is not quite as sharp as
for Gauss-Markov estimates when the signal-to-noise ratio is high (compare Figure 10
with Figures 19a and 20a in Appendix B), it is sharper than those of other commonly
used smoothers (see Schlax and Chelton, 1992), as well as Gauss-Markov estimates when
the signal-to-noise ratio is small (Figures 19c and 20c). For most purposes, the slightly
less efficient filtering characteristics are compensated for by the much greater computational
efficiency of the quadratic loess smoother; in application to large 3-dimensional
data sets such as altimeter data, Gauss-Markov estimates require about two orders of
magnitude more computing effort and are therefore not practical for studies on basin
scales.

3.3.  The  GEOSAT  Ground  Track  Pattern  Sampled  Synoptically 

The combined space and time characteristics of the satellite sampling pattern complicate
interpretation of the equivalent transfer function. The separate effects of spatial
and temporal sampling become clearer if time dependence is first neglected and synoptic
sampling of the ground track pattern in Figure 6a is considered; the effects of asynoptic
sampling of this grid are examined in section 3.4.

The 2-dimensional wavenumber equivalent transfer function for a quadratic loess
estimate constructed from the GEOSAT 17-day sample grid is shown in Figure 11 for an
estimation location at a point where ascending and descending ground tracks cross. For
the purposes of this discussion, the GEOSAT data were subsampled at intervals of 50 km
along the ground tracks. All of the information about the spatial regularity of the sample
grid is contained in this figure. The transfer function is symmetric about both wavenumber
axes. The elliptical plateau centered at zero that drops off steeply to generally small
values at higher wavenumbers is the low-wavenumber pass band of the smoother. The aspect
ratio of this pass band (longer in the meridional wavenumber direction than in the
zonal wavenumber direction) is the inverse of the ratio of smoothing spans $d_y/d_x = 3/5$.
The other subsidiary peaks (with the same aspect ratio as the low-wavenumber pass band)
are aliasing peaks that arise because of the very regular diamond-shaped grid of crossover
points. At the 30° latitude of the estimation location, the dimensions of the diamond
patterns mapped out by the ground tracks are approximately 1.5° of longitude by 3° of
latitude (see Figure 6a). The corresponding Nyquist wavenumbers are about $k_N = 0.0036$
cycle/km (cycles per km) and $l_N = 0.0015$ cycle/km. The minima between the aliasing
peaks and the maxima of the peaks are centered at odd and even multiples, respectively,
of these Nyquist wavenumbers (compare with the 1-dimensional example in Figure 10a).
The coarser ground track pattern of a shorter orbit repeat period would result in larger
diamond patterns and, hence, lower Nyquist wavenumbers and more closely spaced aliasing
peaks. For example, for the approximate 2.7° of longitude by 5.5° of latitude diamonds
of the TOPEX 10-day repeat orbit, the series of aliasing peaks overlap because
the smoothing parameter $d_y = 5°$ is too short for the TOPEX sample grid.

Figure 11. The 2-dimensional wavenumber equivalent transfer function modulus for the GEOSAT ground
track pattern (see Figure 6a) sampled synoptically for estimation point $(x_0, y_0) = (45°W, 30°N)$ and
quadratic loess smoothing parameters $(d_x, d_y) = (5°, 3°)$. These smoothing parameters were chosen
somewhat arbitrarily to illustrate the aliasing patterns inherent in the diamond-shaped sample grid. Note
that the wavenumber axes are linear in this figure.
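
The Nyquist wavenumbers quoted above follow from the diamond dimensions by simple arithmetic, sketched below. The conversion of 111 km per degree of latitude and the function name are assumptions made for illustration; with them, the GEOSAT values are reproduced only to within the rounding implied by "about" in the text.

```python
import math

KM_PER_DEG_LAT = 111.0  # assumed approximate length of one degree of latitude

def diamond_nyquist(dlon_deg, dlat_deg, lat_deg):
    """Return (k_N, l_N) in cycle/km for a diamond of the given size.

    The zonal dimension shrinks with the cosine of latitude; each Nyquist
    wavenumber is 1 / (2 * sample spacing).
    """
    dx_km = dlon_deg * KM_PER_DEG_LAT * math.cos(math.radians(lat_deg))
    dy_km = dlat_deg * KM_PER_DEG_LAT
    return 1.0 / (2.0 * dx_km), 1.0 / (2.0 * dy_km)

# GEOSAT 17-day repeat: diamonds of about 1.5 deg longitude by 3 deg latitude.
k_geosat, l_geosat = diamond_nyquist(1.5, 3.0, 30.0)

# TOPEX 10-day repeat: diamonds of about 2.7 deg longitude by 5.5 deg latitude.
k_topex, l_topex = diamond_nyquist(2.7, 5.5, 30.0)
```

The larger TOPEX diamonds give lower Nyquist wavenumbers in both directions, which is why the aliasing peaks are more closely spaced for that sample grid.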

The diagonal patterns of regularly spaced aliasing peaks are thus an indication of
the non-rectangular grid pattern of the crossover points. The tilting of the lines through
the centers of these aliasing peaks is an indication that aliased features in the sea level
field are tilted parallel to the satellite ground tracks. The slopes of the lines through the
aliasing peaks define the angles of the ground tracks in the spatial domain. In the extreme
case of an orthogonal grid aligned east-west and north-south (which, of course, is
not possible for a satellite orbit but is typical of sampling grids for other types of data),
the aliasing peaks would lie along lines parallel to the wavenumber axes.

A second spatial scale is embedded in the regular pattern of the transfer function
in Figure 11. At the 30° latitude of the estimate, the 50 km sample interval along the
ground track represents zonal and meridional sample intervals of about 20 km and 46 km,
respectively. The corresponding Nyquist wavenumbers are $k_N = 0.025$ cycle/km and
$l_N = 0.011$ cycle/km. These Nyquist wavenumbers define the intersection points of the
diagonal patterns of aliasing peaks; the intersections occur at odd multiples of the zonal
and meridional Nyquist wavenumbers of the along-track sample interval. Sampling at
closer intervals along the ground track would result in higher Nyquist wavenumbers and,
hence, larger diamond patterns of the equivalent transfer function in wavenumber space.

3.4. The 3-Dimensional GEOSAT Data Set

When the asynoptic sampling of the satellite ground track is taken into consideration,
visualization of the 3-dimensional equivalent transfer function is much more difficult
than for the 2-dimensional sample grid considered in section 3.3. As an example of the
ability of the equivalent transfer function to identify space-time structure in the satellite
sampling pattern, a 2-dimensional slice through the transfer function along 90° azimuth
(i.e., along the east axis with zero meridional wavenumber) is shown in Figure 12 for the
GEOSAT data as actually sampled by the satellite. The location of the smoothed estimate
for this example is a crossover point.

The low-frequency pass band of the smoother is evident as the plateau region centered
at zero wavenumber and frequency. The interesting characteristic of the equivalent
transfer function is the distortion of the usual elliptical pass band in the upper right
quadrant. There is a series of aliasing peaks along a line of slope 1 in this log-log plot.

It is easy to show that constant phase propagation at phase speed $c_p$ is manifested in a
log-log plot of the equivalent transfer function as a line with slope 1 that intercepts the
$\log f = 0$ axis at $\log k = -\log c_p$. The intercept of the ridge of aliasing peaks in
Figure 12 at $\log k \approx -1.7$ (with $k$ in cycle/km) thus corresponds to a phase speed of
about 48 km/day.

For the convention used here (see Eq. (7)), the positive $k$ and $f$ in the right half of
Figure 12 represent eastward propagation. The propagation indicated by the ridge of
aliasing peaks in Figure 12 is therefore eastward. An eastward propagation of 48 km/day
corresponds to 144 km eastward propagation in three days. At this latitude of 30°N, this
corresponds to the shift in the 3-day subcycle in the GEOSAT sampling pattern discussed
in section 2.4 (see Figure 6b). The wavenumber-frequency transfer functions for the
TOPEX and ERS-1 sampling patterns similarly show propagations of about 85 km/day
eastward and 48 km/day westward, respectively. These are the zonal shifts of the 3-day
subcycles for these other altimeter satellites.
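
The relation between the intercept of the aliasing ridge and the phase speed follows from $f = c_p k$, so that $\log f = \log k + \log c_p$. The sketch below works through the numbers quoted above; it assumes base-10 logarithms with $k$ in cycle/km and $f$ in cycle/day, and the function names are illustrative.

```python
import math

def log_k_intercept(phase_speed):
    """log10(k) at which the line f = c_p * k crosses log10(f) = 0."""
    return -math.log10(phase_speed)

def phase_speed_from_intercept(log_k):
    """Invert the intercept relation: c_p = 10 ** (-log10(k))."""
    return 10.0 ** (-log_k)

def zonal_shift(phase_speed, days):
    """Distance covered in the given number of days at constant phase speed."""
    return phase_speed * days

intercept_geosat = log_k_intercept(48.0)     # close to the -1.7 read from Figure 12
shift_geosat_km = zonal_shift(48.0, 3.0)     # shift over a 3-day subcycle
```

A phase speed of 48 km/day gives an intercept near -1.68, consistent with the rounded value of -1.7, and a displacement of 144 km in three days.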

The physical interpretation of the propagating aliasing pattern in the equivalent
transfer function for the GEOSAT sampling pattern is that, if there is any eastward
propagating sea level signal with spectral energy at any of the high-wavenumber,
high-frequency peaks along the aliasing ridge, it will alias into the low-pass band of the loess
smoothed estimate of the sea level field. That is, the aliased signal will be indistinguishable
from the low-frequency, low-wavenumber variability of interest and is therefore
undetectable in the smoothed sea level fields. On the other hand, if there is no sea level
propagation at this phase speed, then this aliasing ridge is of little concern.

Figure 12. A slice through the 3-dimensional frequency-wavenumber equivalent transfer function modulus
along the 90° azimuth (eastward) for the GEOSAT ground track pattern as actually sampled during each
17-day exact repeat period for estimation point $(x_0, y_0, t_0) = (45°W, 30°N, \text{day } 100)$ and quadratic loess
smoothing parameters $(d_x, d_y, d_t) = (8°, 8°, 35 \text{ days})$. These smoothing parameters were chosen somewhat
arbitrarily to illustrate the eastward propagating aliasing pattern associated with the 3-day subcycle in the
satellite orbit (see Figure 6b). Note that both axes are logarithmic.

4.  ERRORS  OF  THE  SMOOTHED  ESTIMATES 


The equivalent transfer function only defines the filtering properties of the smoother
for the specific smoothing parameters selected. Additional information about the signal
characteristics is required to assess the quality of smoothed fields constructed from the
irregularly sampled data. The degree to which imperfections in the filtering operation
contaminate estimates of the large-scale, low-frequency signals of interest in the smoothed
fields depends not only on the aliasing patterns in the equivalent transfer function, but
also on the spectral energies of the signal and noise at the wavenumbers and frequencies
of aliasing peaks in the transfer function. The combined effects of filtering properties and
signal and noise characteristics on the accuracy of the smoothed estimates are quantified
in this section.


The smoothed estimate $\hat{h}(t_0)$ can be compared with an ideal low-pass filtered value,
written as

$$\bar{h}(t_0) = \int_{-\infty}^{\infty} \bar{P}^*(f; t_0, f_c)\, H(f)\, df, \tag{8}$$

where $H(f)$ is the Fourier transform of the unsmoothed signal $h(t)$ and $\bar{P}^*(f; t_0, f_c)$ is
the complex conjugate of the transfer function for the ideal smoothed estimate at time
$t_0$. This ideal transfer function passes all of the signal at frequencies lower than the cutoff
frequency $f_c$, and none of the signal at higher frequencies, i.e.,

$$\bar{P}(f; t_0, f_c) = \begin{cases} e^{i 2\pi f t_0}, & |f| < f_c \\ 0, & \text{otherwise.} \end{cases} \tag{9}$$


The complex transfer function $\bar{P}$ thus has unit modulus for frequencies less than $f_c$. We
have found empirically that the cutoff frequency for the quadratic loess smoother used in
this study is related to the half-span of the smoother by $f_c \approx d_t^{-1}$. This value of $f_c$ is
therefore used to define the ideal transfer function in Eq. (9).

Because the measurement errors have zero mean value, it can be seen from Eqs. (5)
and (8) that the bias of the estimate $\hat{h}(t_0)$ is

$$\langle \hat{h}(t_0) \rangle - \bar{h}(t_0) = \int_{-\infty}^{\infty} \Delta P^*(f; t_0, f_c)\, H(f)\, df, \tag{10}$$

where the angle brackets denote the mean value and

$$\Delta P(f; t_0, f_c) = \hat{P}(f; t_0) - \bar{P}(f; t_0, f_c) \tag{11}$$

represents the imperfection of the 1-dimensional equivalent transfer function at frequency
$f$ for an estimate at time $t_0$ with low-frequency cutoff $f_c$. The modulus of $\Delta P$ is shown
schematically by the hatched region in Figure 13. The bias given by Eq. (10) can be
interpreted as the error of the estimate $\hat{h}(t_0)$ in the absence of any measurement errors.
The bias thus focuses attention on errors introduced solely by the irregular sampling
distribution.


Figure 13. A schematic representation of the imperfections of the linear smoother given by Eq. (11)
(hatched region). Horizontal axis: frequency.

In order to express the imperfections of the filtering operation in terms of the spectral
characteristics of the signal $h(t)$, we write the integral in Eq. (10) in the limiting form

$$\langle \hat{h} \rangle - \bar{h} = \lim_{\delta f \to 0} \sum_{n=-\infty}^{\infty} \Delta P^*(f_n)\, H(f_n)\, \delta f. \tag{12}$$


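For convenience, the explicit dependencies on $t_0$ and $f_c$ have been dropped in Eq. (12).
The expected squared bias is

$$\left\langle \left[ \langle \hat{h} \rangle - \bar{h} \right]^2 \right\rangle = \lim_{\delta f \to 0} \sum_{n=-\infty}^{\infty} \sum_{m=-\infty}^{\infty} \Delta P^*(f_n)\, \Delta P(f_m)\, \langle H(f_n) H^*(f_m) \rangle\, (\delta f)^2. \tag{13}$$

Because $h(t)$ is assumed to be a stationary stochastic process, it is easy to show that

$$\langle H(f_n) H^*(f_m) \rangle = 0 \quad \text{for } n \neq m \tag{14}$$

(see, for example, Priestley, 1992, p. 249). The expected squared bias then reduces to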

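$$\left\langle \left[ \langle \hat{h} \rangle - \bar{h} \right]^2 \right\rangle = \lim_{\delta f \to 0} \sum_{n=-\infty}^{\infty} |\Delta P(f_n)|^2\, S_h(f_n)\, \delta f, \tag{15}$$
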

where

$$S_h(f_n) = \langle |H(f_n)|^2 \rangle\, \delta f \tag{16}$$

is the power spectral density of the random process $h(t)$ (Priestley, 1992, p. 208). In the
limit, Eq. (15) becomes the integral

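$$\left\langle \left[ \langle \hat{h} \rangle - \bar{h} \right]^2 \right\rangle = \int_{-\infty}^{\infty} |\Delta P(f)|^2\, S_h(f)\, df. \tag{17}$$
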

The expected squared bias (ESB) given by Eq. (17) describes the combined effects
of the signal spectral energy and the equivalent transfer function on the accuracy of the
smoothed estimate $\hat{h}(t_0)$. At frequencies $f$ where either the aliasing $|\Delta P(f)|$ or the
signal energy $S_h(f)$ is small, the integrand in Eq. (17) is small and consequently
contributes little to the ESB. Aliasing at frequencies where $|\Delta P(f)|$ is large is therefore of
little concern if the corresponding signal spectral energy $S_h(f)$ is weak.
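
As a rough numerical illustration of Eq. (17), the sketch below evaluates the ESB by trapezoidal quadrature for a set of linear smoother weights. Several ingredients are assumptions made only for this example: the equivalent transfer function is taken to have the form $\hat{P}(f) = \sum_j \alpha_j e^{i 2\pi f t_j}$ (chosen for consistency with Eq. (9); Eq. (5) is not reproduced in this section), the weights are those of an equal-weight running mean rather than a quadratic loess smoother, the signal spectrum is Gaussian with unit variance, and all function names are hypothetical.

```python
import cmath
import math

def equivalent_transfer(weights, times, f):
    """Assumed form: P_hat(f) = sum_j alpha_j * exp(i * 2 * pi * f * t_j)."""
    return sum(a * cmath.exp(2j * math.pi * f * t) for a, t in zip(weights, times))

def ideal_transfer(f, t0, fc):
    """Ideal low-pass transfer function: unit modulus for |f| < fc, else zero."""
    return cmath.exp(2j * math.pi * f * t0) if abs(f) < fc else 0j

def expected_squared_bias(weights, times, t0, fc, spectrum, f_max, n=4001):
    """Trapezoidal estimate of the ESB integral over -f_max <= f <= f_max."""
    df = 2.0 * f_max / (n - 1)
    total = 0.0
    for i in range(n):
        f = -f_max + i * df
        delta_p = equivalent_transfer(weights, times, f) - ideal_transfer(f, t0, fc)
        end_weight = 0.5 if i in (0, n - 1) else 1.0
        total += end_weight * abs(delta_p) ** 2 * spectrum(f) * df
    return total

def gaussian_spectrum(f, tau0=5.0):
    """Unit-variance spectrum of the autocorrelation exp(-(tau / tau0) ** 2)."""
    return math.sqrt(math.pi) * tau0 * math.exp(-(math.pi * tau0 * f) ** 2)

# Hypothetical design: 11 evenly spaced observations with equal weights,
# estimation time at the center of the record.
times = [float(j) for j in range(-5, 6)]
weights = [1.0 / len(times)] * len(times)
esb = expected_squared_bias(weights, times, t0=0.0, fc=0.1,
                            spectrum=gaussian_spectrum, f_max=0.5)
```

Replacing `gaussian_spectrum` with a spectrum that has little energy where $|\Delta P(f)|$ is large reduces the result, which is the point made in the preceding paragraph.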

The ESB as a measure of the accuracy of the estimate $\hat{h}(t_0)$ can be compared with
the mean squared error that is more traditionally used to assess the quality of an estimate.
For a given realization of the stochastic process $h(t)$, the mean squared error can
be decomposed into the sum of the squared bias and the variance. The expected value
of the mean squared error over the ensemble of realizations of the process (the EMSE) is
therefore the sum of the ESB given by Eq. (17) and the variance of the estimate. By the
same method used to derive Eq. (17), it is easy to show from Eq. (5) that the variance of
the estimate is

$$\left\langle \left[ \langle \hat{h} \rangle - \hat{h} \right]^2 \right\rangle = \int_{-\infty}^{\infty} |\hat{P}(f)|^2\, S_\epsilon(f)\, df, \tag{18}$$

where $S_\epsilon(f)$ is the power spectral density of the measurement errors. The variance of
the smoothed estimate thus describes the combined effects of the spectral characteristics
of the measurement errors and the equivalent transfer function on the accuracy of the
smoothed estimate $\hat{h}(t_0)$.

The present study is primarily concerned with the limitations imposed by the sampling
design, regardless of the measurement errors. In the extreme case of no measurement
errors, the variance of the smoothed estimate is zero and the EMSE is just the ESB.
Then all of the errors in the smoothed estimate arise from the sampling design. For a
reasonably large signal-to-noise variance ratio (greater than 1) and a sufficiently dense
sample design (i.e., a well behaved equivalent transfer function $\hat{P}$), the ESB is generally much
larger than the variance. Then the EMSE can be approximated as just the ESB. For the
altimeter applications of interest in this study, the signal-to-noise variance ratio is large
enough that the variance contribution to the EMSE can be neglected. We therefore
restrict attention to the ESB as a measure of the accuracy of the smoothed estimates.

The mean squared error formalism is easily extended to three dimensions. The ESB
for one dimension (Eq. (17)) then becomes

$$\left\langle \left[ \langle \hat{h}(x_0, y_0, t_0) \rangle - \bar{h}(x_0, y_0, t_0) \right]^2 \right\rangle = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} |\Delta P|^2\, S_h(k, l, f)\, dk\, dl\, df, \tag{19}$$

where

$$\Delta P = \hat{P}(k, l, f; x_0, y_0, t_0) - \bar{P}(k, l, f; x_0, y_0, t_0, k_c, l_c, f_c) \tag{20}$$

represents the imperfections of the 3-dimensional equivalent transfer function $\hat{P}$ compared
with the 3-dimensional transfer function $\bar{P}$ of the ideal smoother. Determination of the
ESB of a 3-dimensional smoothed estimate thus requires knowledge of the 3-dimensional
wavenumber-frequency spectrum $S_h(k, l, f)$ of the signal.


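In multiple dimensions, the smoother weights $\alpha_j$ for observations $g(x_j, y_j, t_j)$ on a
sufficiently dense and regularly spaced sample grid depend only on the distance $r$ from
the estimation location $(x_0, y_0, t_0)$. For the quadratic loess smoother with half spans $d_x$,
$d_y$ and $d_t$, this distance is defined by

$$r^2 = \left( \frac{x - x_0}{d_x} \right)^2 + \left( \frac{y - y_0}{d_y} \right)^2 + \left( \frac{t - t_0}{d_t} \right)^2. \tag{21}$$
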


The Fourier transform of an elliptically symmetric function is also elliptically symmetric
(Bracewell, 1978, p. 244). It is therefore appropriate to use an elliptically symmetric
ideal transfer function for the multidimensional bias calculation, i.e.,

$$\bar{P}(k, l, f; x_0, y_0, t_0, k_c, l_c, f_c) = \begin{cases} e^{-i 2\pi (k x_0 + l y_0 - f t_0)}, & (k/k_c)^2 + (l/l_c)^2 + (f/f_c)^2 < 1 \\ 0, & \text{otherwise.} \end{cases} \tag{22}$$

As before, the low-pass wavenumber and frequency cutoffs $k_c$, $l_c$ and $f_c$ for the quadratic
loess smoother are approximately the reciprocal of the half spans in each dimension.

In three dimensions, evaluation of the triple integral in Eq. (19) by the usual quadrature
methods is computationally intensive. For this study, these integrals were estimated
using a weighted Monte Carlo method that is based on sampling the region of integration
at discrete sample points distributed with a probability density proportional to the signal
spectral energy (Press et al., 1992, p. 306).
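
The idea behind such a weighted Monte Carlo estimate can be sketched in one dimension: if frequencies are drawn from a probability density equal to the normalized spectrum $S_h(f)/\sigma_h^2$, the normalized ESB integral is simply the sample mean of $|\Delta P|^2$. The example below is an illustration under stated assumptions, not the procedure of Press et al. or the implementation used in this study: it assumes a Gaussian spectrum (which is a normal density in $f$), a deliberately simple $|\Delta P|^2$ for which the exact answer is known, and hypothetical function names.

```python
import math
import random

def sample_gaussian_spectrum(rng, tau0):
    """Draw a frequency from the density S_h(f) / sigma_h^2 of a Gaussian spectrum.

    sqrt(pi) * tau0 * exp(-(pi * tau0 * f) ** 2) is a normal density with zero
    mean and standard deviation 1 / (sqrt(2) * pi * tau0).
    """
    return rng.gauss(0.0, 1.0 / (math.sqrt(2.0) * math.pi * tau0))

def normalized_esb_monte_carlo(delta_p_sq, tau0, n_samples, seed=0):
    """Estimate the integral of |dP|^2 * S_h / sigma_h^2 as a sample mean of |dP|^2."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        total += delta_p_sq(sample_gaussian_spectrum(rng, tau0))
    return total / n_samples

# Hypothetical imperfection: a smoother that passes every frequency, so that
# |dP|^2 is 1 outside the ideal pass band |f| < fc and 0 inside it.
fc, tau0 = 0.05, 5.0
estimate = normalized_esb_monte_carlo(lambda f: 0.0 if abs(f) < fc else 1.0,
                                      tau0, 20000)
exact = math.erfc(math.pi * tau0 * fc)  # Gaussian tail probability for this case
```

Because the sample points concentrate where the signal energy is large, few evaluations are wasted on wavenumbers and frequencies that contribute little to the integral, which is the motivation for this choice in three dimensions.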

The power spectral density and the autocovariance function of the signal are Fourier
transform pairs (Priestley, 1992, p. 211). The spectral properties of the signal can therefore
be specified directly or can be computed from a specified autocovariance function
(equivalent to specifying the signal variance and autocorrelation function). The need
to specify the signal variance (which varies geographically for the sea level fields of
interest in this study) can be sidestepped by considering the relative expected squared
bias (RESB), defined to be the ESB given by Eq. (19) normalized by the signal variance
$\sigma_h^2$. For the applications considered in section 5, the signal spectral shape $S_h(k, l, f)/\sigma_h^2$
needed to evaluate the relative accuracy of the smoothed estimate $\hat{h}(t_0)$ by the RESB was
obtained from the Fourier transform of the specified signal autocorrelation function.



5.  RESOLUTION  CAPABILITY 
5.1. A 1-Dimensional Example

The philosophy adopted here to define the resolution capability of an irregularly
spaced data set is easily demonstrated by a simple 1-dimensional example. A densely
sampled synthetic high-frequency time series with unit variance is shown in Figure 14a.
The details of how this time series was generated are not important to this discussion.
The effects of nonuniform sampling of this time series are illustrated by sampling the
time series in Figure 14a with periodic bursts of closely spaced observations, separated
by intervals of coarsely spaced observations. This sampling strategy is intended to be a
1-dimensional analog of the sampling characteristics of altimeter data, which are characterized
by dense 2-dimensional sampling at crossover points and sparse coverage elsewhere.
Two different loess smoothed time series were constructed from the unequally spaced
observations to show how the ESB (Eq. (17)) can be used to select good smoothing
parameters for the linear estimates.


Figure 14. a) A high-frequency synthetic time series. This time series was observed in bursts of
sample interval 0.2 separated by sparse observations at sample interval 2.0; b) a quadratic loess
smoothed time series constructed at intervals of 0.2 using half spans of $d_t = 0.6$ during the bursts of
closely spaced observations and $d_t = 30$ during the periods of coarsely spaced observations; c) a quadratic
loess smoothed time series constructed at intervals of 0.2 using a fixed half span of $d_t = 30$
everywhere.


In the first example (Figure 14b), the smoothing parameters $d_t$ of the loess estimates
were chosen to maximize the information content of the observations. A small value of
$d_t$ was used during the bursts of closely spaced observations and a larger value of $d_t$ was
used during the periods of coarsely spaced observations. As noted in section 3.2, the
low-pass frequency cutoff of the loess smoother is $f_c \approx d_t^{-1}$. Consequently, the spectral
content of the loess estimates in the coarsely sampled periods is restricted to lower
frequencies than in the burst periods. The resulting nonstationarity of the smoothed time series
is readily apparent from Figure 14b. Another undesirable characteristic of the smoothed
time series is the nonstationary ESBs of the loess smoothed estimates, which vary from
negligibly small in the burst periods to 0.02 in the coarsely sampled periods.

In the second example (Figure 14c), the smoothed time series was constructed by
fixing the loess smoothing parameter throughout the record to the large value used in
Figure 14b in the coarsely sampled periods. This is equivalent to sacrificing the higher
resolution capability in the burst periods (i.e., "oversmoothing" the data). However, the
benefits of this procedure are apparent from Figure 14c; the spectral content of the resulting
smoothed time series is stationary. In addition, the ESBs of the loess estimates are
uniform (0.02) throughout the record.

The need to degrade the higher resolution possible in the burst regions is disappointing.
However, for analysis of the full record of unequally spaced observations, the
homogeneously smooth time series in Figure 14c is much more desirable than the nonstationary
time series in Figure 14b. If the interest is in the higher frequency variability that can be
resolved in the burst periods, then the analysis must be restricted to just the burst periods.
Then the longer-period information content of the full data set is lost by sacrificing
the coarsely sampled periods of the data record.

The philosophy for choosing the appropriate smoothing parameter is therefore
to smooth the data to the resolution that is possible in the sparsely sampled regions.
This can be achieved by selecting a single smoothing parameter for the entire data set
that yields a uniform ESB at every location at which a smoothed estimate is to be
constructed. The spectral content of the resulting smoothed time series will be stationary.
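
The selection rule just described can be written as a short search, sketched below under stated assumptions. Given ESB values tabulated for several candidate smoothing parameters at each estimation location, it returns the smallest candidate from which the ESB is the same at all locations to within a tolerance. The function name, the tolerance, and the tabulated numbers are hypothetical; the numbers are invented for illustration and are not read from Figure 14.

```python
def smallest_homogeneous_span(spans, esb_by_location, tolerance):
    """Smallest span from which the ESB agrees across locations.

    spans: candidate smoothing parameters in increasing order.
    esb_by_location: dict mapping a location name to a list of ESB values,
        one per candidate span.
    Returns None if even the largest span is not homogeneous.
    """
    best = None
    # Scan from the largest span downward and stop at the first span for
    # which the spread of ESB values across locations exceeds the tolerance.
    for i in range(len(spans) - 1, -1, -1):
        values = [curve[i] for curve in esb_by_location.values()]
        if max(values) - min(values) > tolerance:
            break
        best = spans[i]
    return best

# Invented ESB tables for a burst period and a coarsely sampled period.
spans = [0.6, 3.0, 10.0, 30.0, 60.0]
esb_table = {
    "burst period": [0.0, 0.001, 0.005, 0.02, 0.03],
    "sparse period": [0.5, 0.2, 0.08, 0.02, 0.03],
}
chosen_span = smallest_homogeneous_span(spans, esb_table, tolerance=0.001)
```

Scanning downward from the largest span avoids selecting a small span at which the ESB curves happen to cross before diverging again.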

5.2.  The  GEOSAT  Ground  Track  Pattern  Sampled  Synoptically 

Extension of the results of section 5.1 to two spatial dimensions further emphasizes
the importance of degrading the resolution capability in densely sampled regions. As in
section 3, the full 3-dimensional characteristics of altimeter sampling are more easily
understood if time dependence is first neglected and synoptic sampling of the ground track
pattern in Figure 6a is considered. Near the crossover points, this sample grid is capable
of providing detailed maps of mesoscale variability. However, along the ground tracks
connecting crossover points and in the unsampled diamond regions in between, only the
larger scale variability can be resolved. A map constructed with the highest resolution
possible at each location (analogous to the 1-dimensional case in Figure 14b) would
consist of a patchwork quilt of eddies and meanders near the crossover points and smooth,
large-scale variability elsewhere.

These effects can be quantified in terms of the RESB. The 2-dimensional wavenumber
spectrum of sea level must be specified to obtain the RESB. Analyses of dynamic
height fields from hydrographic data provide useful guidance. Shen et al. (1986), Carter
and Robinson (1987) and other studies have found that the spatial structure of the sea
level field can be approximated by an isotropic Gaussian autocorrelation function of the
form

$$\rho(r) = e^{-(r/r_0)^2}, \tag{23}$$

where $r$ is distance and the spatial scale $r_0$ is approximately 50 km. This spatial scale is
consistent with independent estimates of $\rho(r)$ computed directly from altimeter data (Le
Traon et al., 1990). The normalized 1-dimensional wavenumber spectrum of sea level for
computing the RESB from Eq. (17) was obtained analytically from the Fourier transform
of this Gaussian autocorrelation function,

$$\frac{S_h(k)}{\sigma_h^2} = \sqrt{\pi}\, r_0\, e^{-(\pi r_0 k)^2} \tag{24}$$

(Bracewell, 1978, p. 130), where $\sigma_h^2$ is the (unspecified) sea level variance.
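
The Fourier transform pair relating the Gaussian autocorrelation function to its spectrum can be verified numerically. The sketch below assumes that Eq. (23) has the form $\rho(r) = e^{-(r/r_0)^2}$ and Eq. (24) the form $\sqrt{\pi}\, r_0\, e^{-(\pi r_0 k)^2}$, and compares the analytic spectrum with a trapezoidal cosine transform of $\rho(r)$. The integration limits, grid size, test wavenumber, and function names are illustrative choices.

```python
import math

def gaussian_autocorrelation(r, r0):
    """Assumed Gaussian autocorrelation function exp(-(r / r0) ** 2)."""
    return math.exp(-(r / r0) ** 2)

def analytic_spectrum(k, r0):
    """Normalized spectrum sqrt(pi) * r0 * exp(-(pi * r0 * k) ** 2)."""
    return math.sqrt(math.pi) * r0 * math.exp(-(math.pi * r0 * k) ** 2)

def numerical_spectrum(k, r0, half_width=10.0, n=20001):
    """Trapezoidal cosine transform of rho(r) over |r| <= half_width * r0."""
    r_max = half_width * r0
    dr = 2.0 * r_max / (n - 1)
    total = 0.0
    for i in range(n):
        r = -r_max + i * dr
        end_weight = 0.5 if i in (0, n - 1) else 1.0
        total += (end_weight * gaussian_autocorrelation(r, r0)
                  * math.cos(2.0 * math.pi * k * r) * dr)
    return total

r0 = 50.0   # spatial scale in km, as quoted in the text
k = 0.005   # illustrative wavenumber in cycle/km
difference = abs(numerical_spectrum(k, r0) - analytic_spectrum(k, r0))
```

Only the cosine transform is needed because $\rho(r)$ is an even function of $r$, so the sine part of the Fourier integral vanishes.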

The RESB was computed from the GEOSAT sample grid for a range of loess
smoothing parameters $d_x$ and $d_y$ at three estimation locations: a crossover point, a
diamond center, and a point along a ground track midway between two crossover points
(referred to here as a midpoint). A contour plot of the RESBs for the midpoint is shown
in Figure 15. It is evident from this figure that there is no unique choice of smoothing
parameters for a particular RESB; a given RESB can be obtained with high meridional
resolution and low zonal resolution, with low meridional resolution and high zonal
resolution, or by compromising to obtain moderate resolution in both dimensions. The
approximate 2-to-1 aspect ratio of the contours indicates that a greater degree of smoothing
is required in the meridional direction than in the zonal direction to obtain a given
RESB. This is because of the longer meridional dimensions of the diamonds formed by
the ground track patterns.


Figure 15. Contour plot of the relative expected squared bias as a function of the longitudinal and
latitudinal quadratic loess smoothing parameters $d_x$ and $d_y$ for the GEOSAT ground track pattern (see
Figure 6a) sampled synoptically for an estimate at a crossover point. Horizontal axis: $d_x$ (deg longitude).


The simplest form of spatial smoothing is the isotropic smoother for which $d_x = d_y = d_s$.
Isotropic smoothing is used in Figure 16 to illustrate the geographical variability of
the RESB. The three curves represent the RESB as a function of $d_s$ for the three estimation
points considered. At the shortest smoothing scale of $d_s = 2°$, the RESB is highest
at the diamond center and lowest at the midpoint. At both of these locations, the RESB
decreases monotonically as the smoothing parameter $d_s$ increases, converging at about
$d_s = 4°$. Curiously, the RESB at the crossover actually increases as $d_s$ is increased from
2° to 2.5° and then decreases monotonically for larger $d_s$.

Figure 16. The relative expected squared bias as a function of isotropic quadratic loess smoothing
parameter $d_s$ for the GEOSAT ground track pattern sampled synoptically for estimates at a
crossover location (solid line), a diamond center (dashed line) and along a ground track at the
midpoint between two crossovers (dotted line).


The behavior of the RESB at the crossover is counter-intuitive. For very small $d_s$
(not shown in Figure 16), the RESB is smaller at the crossover than at the midpoint.
However, for $d_s = 2°$, the RESB is lower at the midpoint because the region of influence
about the midpoint then includes observations on the neighboring ground tracks from
a wide range of directions. In comparison, the region of influence about the crossover
for $d_s = 2°$ includes data only from the two diagonal ground tracks passing through
the crossover; observations are not available from the regions directly north, south, east
and west of the crossover. As the span further increases to $d_s = 2.5°$, the lack of zonal
and meridional constraints on the 2-dimensional smoothed estimate at the crossover
becomes more significant, further increasing the RESB. When $d_s$ exceeds 2.5°, the region
of influence for the crossover estimate becomes large enough to include zonally adjacent
crossovers and neighboring crossovers along the ground tracks that intersect at the
estimation point. The 2-dimensional field is then well resolved in all directions and the
RESB of the crossover estimate begins to decrease with increasing $d_s$.

The important point made by Figure 16 is that the RESB is not homogeneous over
the map for small spatial smoothing parameter $d_s$. According to the criterion outlined
in section 5.1, the best value of $d_s$ for loess estimates constructed at an arbitrary location
$(x_0, y_0)$ is the smallest value that gives spatially homogeneous RESB. On the basis
of Figure 16, this is about 5°, which is the value of $d_s$ at which the RESB curves for the
three estimation locations converge. Such a large degree of smoothing is somewhat overly
pessimistic, however, since this is larger than the dimensions of the GEOSAT diamonds.
The same RESB can be obtained at a somewhat higher resolution if estimates are
constructed only at the crossover points. Moreover, when the asynopticity of the sampling
of the GEOSAT grid is considered (see section 5.3), temporal smoothing can be used to
further increase the spatial resolution capability.

5.3.  The  3-Dimensional  GEOSAT  Data  Set 

The RESB is a complicated function of time and geographical location when the asynoptic sampling characteristics of the GEOSAT ground tracks are considered. The wavenumber-frequency spectral shape for determining the RESB from Eq. (19) was derived by assuming a Gaussian space-time autocorrelation function

$$\rho(r, \tau) = e^{-(r/r_0)^2 - (\tau/\tau_0)^2}, \tag{25}$$

where $r$ is the isotropic spatial lag as in section 5.2 and $\tau$ is the time lag. The spatial and temporal scales were chosen to be $r_0$ = 50 km and $\tau_0$ = 30 days. This form is consistent with the space-time autocorrelation function derived from dynamic height data (Shen et al., 1986; Carter and Robinson, 1987). The corresponding normalized power spectral density for computing the RESB is

$$S(k, l, f) = \pi^{3/2}\, r_0^2\, \tau_0\, e^{-\pi^2 \left[ r_0^2 (k^2 + l^2) + \tau_0^2 f^2 \right]}. \tag{26}$$
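As a hedged numerical check on how a Gaussian autocorrelation maps to a Gaussian spectrum, the sketch below treats only the temporal factor. It assumes the form $e^{-(\tau/\tau_0)^2}$ used in Appendix B and assumes frequency measured in cycles per unit time; under those assumptions the transform is $\sqrt{\pi}\,\tau_0\,e^{-(\pi f \tau_0)^2}$, a standard result. The function names are illustrative and are not part of the original analysis.

```python
import math


def gaussian_autocorrelation(lag, tau0):
    """Gaussian autocorrelation exp(-(lag / tau0)^2)."""
    return math.exp(-(lag / tau0) ** 2)


def spectrum_numeric(freq, tau0, half_width=10.0, step=0.01):
    """Trapezoidal estimate of the cosine transform of the autocorrelation.

    The integration runs over lags from -half_width*tau0 to +half_width*tau0
    with spacing step*tau0 (both settings are assumptions of this sketch).
    """
    n = int(round(2.0 * half_width / step))
    total = 0.0
    for k in range(n + 1):
        lag = (-half_width + k * step) * tau0
        term = gaussian_autocorrelation(lag, tau0) * math.cos(
            2.0 * math.pi * freq * lag)
        total += 0.5 * term if k in (0, n) else term
    return total * step * tau0


def spectrum_analytic(freq, tau0):
    """Closed-form transform: sqrt(pi) * tau0 * exp(-(pi * freq * tau0)^2)."""
    return math.sqrt(math.pi) * tau0 * math.exp(-(math.pi * freq * tau0) ** 2)


# Temporal scale of 30 days, frequency of 0.01 cycles per day.
numeric_value = spectrum_numeric(0.01, 30.0)
analytic_value = spectrum_analytic(0.01, 30.0)
```

The spatial factor behaves the same way in each horizontal dimension, so a longer decorrelation scale concentrates the spectral energy at lower wavenumbers and frequencies.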


The RESB at a crossover point is contoured in Figure 17 for a range of temporal and isotropic spatial smoothing parameters $d_t$ and $d_s$ at two different times during the GEOSAT 17-day repeat. The plot for day 2 (Figure 17a) corresponds to a time when both ground tracks passed through this crossover within a 24-hour period. The RESB is therefore generally small. The peculiar behavior for small $d_t$ and $d_s$ is a more complex manifestation of the radius of influence problem discussed in section 5.2 (see Figure 16). When $d_s$ = 2°, the RESB first increases with increasing $d_t$ until $d_t \approx 15$ days and then decreases monotonically for larger $d_t$. This is because short temporal smoothing is well resolved near the time when both ground tracks sample the crossover point. The smoothed sea level field over longer 15-day periods is not as well resolved because of the long interval between GEOSAT sampling of neighboring ground tracks. The temporal half span must be increased to more than 15 days for the radius of influence of the 3-dimensional quadratic loess smoother to become large enough to include observations from neighboring ground tracks, thereby decreasing the RESB.

Figure 17. The relative expected squared bias as a function of the temporal and isotropic spatial loess smoothing parameters $d_t$ and $d_s$ for the GEOSAT ground track pattern as actually sampled during each 17-day exact repeat period. The two panels correspond to estimates at a crossover location on a) day 2; and b) day 11.

A similar effect occurs as a function of $d_s$ when $d_t$ is small. When $d_t$ = 10 days, the RESB initially decreases with increasing $d_s$ until $d_s \approx 3.8°$. For larger $d_s$, the RESB first increases until $d_s \approx 4.5°$ and then decreases monotonically for larger $d_s$. This effect is related to the complex temporal structure of the GEOSAT sampling of nearby ground tracks. For $d_t$ = 10 days, the spatial structure of the low-pass filtered sea level field is not well resolved when $d_s$ = 4.5°; the smoothed sea level field at this time and location is better resolved in the quadratic loess smoothed estimate by either decreasing or increasing the degree of spatial smoothing.

The contour plot of RESB for day 11 (Figure 17b) is much simpler than that for day 2. The RESB decreases monotonically with increasing $d_t$ and $d_s$ over the full ranges of these smoothing parameters. At day 11, this crossover point and the neighboring ground tracks are not sampled at nearby times. A much greater degree of smoothing (spatially or temporally) is therefore required than on day 2 to achieve a given value of RESB.

The spatial and temporal inhomogeneity of the RESB evident from Figure 17 complicates selection of a good combination of the smoothing parameters $d_s$ and $d_t$. A given value of the RESB can be achieved at any particular estimation time $t_0$ by trading off $d_s$ against $d_t$. However, the RESB for a specific choice of these smoothing parameters will, in general, differ for different estimation times $t_0$.

The temporal variability of the RESB at this crossover location during a 17-day GEOSAT repeat period is shown in Figure 18 for six different combinations of $d_s$ and $d_t$. For small $d_t$, the RESB is rather erratic and varies by more than an order of magnitude over the 17-day repeat period unless $d_s$ is very large (see the examples for $(d_s, d_t)$ = (4,10) and (8,10)); the RESB is generally large with localized decreases at times when there are GEOSAT ground tracks nearby. When the temporal span $d_t$ is increased to values larger than the 17-day repeat period, the radius of influence of the quadratic loess smoother includes sufficient data to yield well-behaved time series of the RESB. For this particular crossover location, the RESB tends to have a minimum at day 2 with maxima at days 7 and 14 separated by a local minimum at day 11 (see the example for $(d_s, d_t)$ = (2,20)). These features of the RESB time series reflect the temporal distribution of ascending and descending ground tracks in the vicinity of the estimation location.

The time series of RESB at other crossover locations exhibit similar periodic variations over each 17-day repeat period. The timing of the minima and maxima varies, depending on the temporal distribution of GEOSAT data near the particular crossover location.

Figure 18. The relative expected squared bias as a function of time at a GEOSAT crossover location for six different choices of the temporal and isotropic spatial loess smoothing parameters $d_t$ and $d_s$.

Two of the combinations of $d_s$ and $d_t$ in Figure 18 are of particular interest. When $(d_s, d_t)$ = (3,30) the RESB is approximately a constant value of 0.1 over the 17-day repeat period. When $(d_s, d_t)$ = (4,30), the RESB is about 0.05 and is also constant over the 17-day period. The time series of RESB at other crossover locations are similarly approximately constant over the 17-day repeat period. By the criterion outlined in section 5.1, either of these would therefore be good choices for $d_s$ and $d_t$. The choice of $(d_s, d_t)$ = (4,30) is the more conservative of the two as it yields an RESB that is about half as large. Note that while the RESB for $(d_s, d_t)$ = (4,20) is everywhere smaller than that for (3,30), it varies by a factor of two over the 17-day period. As discussed in section 5.1, this temporally inhomogeneous RESB is less desirable than tolerating the somewhat higher RESB for $(d_s, d_t)$ = (3,30).
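The screening step implied by this criterion can be sketched as follows. The RESB values are illustrative placeholders chosen only to mimic the qualitative behavior described in the text; they are not read from Figure 18. The tolerance `max_ratio` and the function names are likewise assumptions of this sketch.

```python
def variation_ratio(resb_series):
    """Ratio of the largest to the smallest RESB over the repeat period."""
    return max(resb_series) / min(resb_series)


def homogeneous_choices(resb_by_params, max_ratio=1.2):
    """Return the (d_s, d_t) pairs whose RESB varies by at most max_ratio."""
    return sorted(params for params, series in resb_by_params.items()
                  if variation_ratio(series) <= max_ratio)


# Hypothetical RESB time series keyed by (d_s in degrees, d_t in days).
example_resb = {
    (3, 30): [0.100, 0.105, 0.098, 0.102],
    (4, 30): [0.050, 0.052, 0.049, 0.051],
    (4, 20): [0.040, 0.080, 0.050, 0.070],
}
candidates = homogeneous_choices(example_resb)
```

The screen deliberately ranks temporal homogeneity ahead of the absolute size of the RESB, so a combination with a lower but strongly varying RESB is rejected.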

On the basis of Figure 18, we conclude that the GEOSAT sampling pattern is capable of resolving the spatial and temporal characteristics of sea level variability on monthly and longer time scales. The spatial resolution of monthly maps constructed from GEOSAT data is 3° or 4°, depending on how liberal one chooses to be about the degree of RESB that is tolerable. The effects of measurement errors and data dropouts have been neglected in this analysis.

It should be noted that the RESB for these choices of loess smoothing parameters is generally larger and not necessarily constant temporally over the 17-day repeat period at locations other than at crossover points. A greater degree of spatial or temporal smoothing would therefore be necessary for quadratic loess estimates at these other locations. However, the spatial smoothing parameters of $d_s$ = 3° or 4° are large enough that the spatial dimensions of the smoothed estimates at neighboring crossover locations already overlap. Consequently, estimates at only the crossover points are adequate for constructing maps of the sea level field and there is no need to estimate the smoothed field at any other locations.

6. DISCUSSION AND CONCLUSIONS

The summary of past altimeter studies in section 2 showed that determination of absolute sea surface topography by satellite altimetry is presently limited by uncertainties in the marine geoid and orbit height. As a result of significant improvements in precision orbit determination, orbit errors are now less than 10 cm, which has greatly extended the utility of altimeter data. The marine geoid will continue to be the limiting factor until a dedicated gravity-mapping satellite can be launched to map the global marine geoid with an accuracy of a few centimeters on scales of ~50 km and longer. The accuracy of presently available estimates of the marine geoid is ~30 cm overall. Because the longer scales of the marine geoid are known most accurately, accurate estimates of the mean sea surface topography are limited to only very large spatial scales.

Uncertainties in the marine geoid and orbit height become much less important if interest is restricted to studies of sea level variability: the marine geoid is eliminated because it is time invariant over the duration of a satellite mission, and time-dependent orbit errors can essentially be eliminated by simple statistical techniques. Numerous studies have shown that variance statistics can be reliably computed from altimeter data and analyzed to study ocean variability on geographical scales that cannot be addressed by other data sets. The global geographical distribution and dynamics of eddy variability have been investigated from sea level variances and wavenumber spectra derived from altimeter data. The anisotropy of surface velocity variability near topographic features and in the vicinity of intense currents has been investigated over the Southern Ocean from surface geostrophic velocity variances estimated from altimeter data. The velocity variances have also been used to investigate eddy transfer of momentum in strong, horizontally sheared mean flows.

The variance statistics that can be readily obtained from altimeter data do not fully exploit the information content of the data. For many applications, it is desirable to map the time evolution of the sea level field. This is a much more difficult problem as it requires an understanding of the resolution capability of altimeter data. To date, the scales considered in studies of mapped sea level variability vary widely and have been chosen rather arbitrarily.

In this paper, a method has been presented for quantifying the resolution capability of an arbitrarily sampled data set. The emphasis has been on altimeter data, but the method is applicable to any irregularly sampled data set. The focus here is on deriving sea level fields for applications such as descriptive studies of sea level variability and model validation. Ultimately, it may be possible to derive higher spatial and temporal resolution sea level fields by combining the data with a model through some form of sophisticated data assimilation. Before this is done, however, the information content of the data alone must be established. The method here identifies the scales at which reliable sea level fields can be derived from altimeter data.

The starting point for application of the method is to concede that a practical limitation of the coarse grid formed by the ground track pattern and asynoptic sampling of the grid is that altimeter data can only resolve large-scale, low-frequency variability. The data must therefore be smoothed to some degree to reduce the effects of aliasing of unresolved variability. The term aliasing is used here for irregularly spaced data in a more general context than the classical meaning of the term, as discussed in section 3.2. The objective is to smooth the data to the minimum degree necessary, thereby preserving as much of the information content of the data as possible.

The methodology is based on the equivalent transfer function, which is easily computed as the Fourier transform of the weights of an arbitrary linear, smoothed estimate. The equivalent transfer function defines how the spectral content of the observations (signal plus noise) is filtered in a smoothed estimate of the sea level field at a specific location in space and time and for a specific choice of smoothing parameters. The equivalent transfer function also provides an efficient way to describe systematic patterns in the sampling characteristics that are often difficult to detect by other means. The 3-day subcycle in the ground track pattern of altimeter satellites shown in section 3.4 is a relatively simple example. A more complicated example where the equivalent transfer function has proven useful is in the determination of which tidal frequencies can significantly alias altimeter estimates of large-scale, low-frequency sea level variability for a specific orbit configuration (Schlax and Chelton, 1993).

The equivalent transfer function only identifies the wavenumbers and frequencies at which contamination of the low-frequency, low-wavenumber scales of interest is potentially a problem. As such, the equivalent transfer function is not sufficient to determine the resolution capability of the irregularly sampled data set. If there is no signal energy at these wavenumbers and frequencies then there is no contamination of the low-pass band. The mean squared error formalism in section 4 quantifies the degree of contamination by combining the equivalent transfer function and the signal spectrum to quantify the accuracy of the smoothed estimate of the field for a specific location and a specific degree of smoothing. In practice, the relative expected squared bias (RESB) contribution to the mean squared error is usually sufficient to determine the resolution capability of the data set.

The method thus requires that the shape of the signal spectrum be prescribed a priori in order to compute the RESB. For the sea level signal of interest here, the signal autocorrelation function was assumed to be Gaussian in space and time with spatial and temporal scales of 50 km and 30 days. This form was adopted on the basis of independent estimates from hydrographic data. The RESBs presented here are pessimistic if these decorrelation scales are too short.

The procedure for determining the resolution capability is straightforward but involves a large volume of information that must be examined to determine the degree of smoothing necessary to obtain sensible fields from the irregularly spaced observations. The wavenumber-frequency content of the linear estimate and the RESB in general vary with the time and location of the estimation point and with the specified degree of smoothing. The approach requires determination of the RESB at a large number of estimation points for a wide range of smoothing parameters. For the quadratic loess smoother used here (see Appendix A), the smoothing parameters are the half spans of the smoother in the three dimensions. For the Gauss-Markov smoothers discussed in Appendix B, the smoothing parameters are the correlation time scales in the three dimensions and the signal-to-noise variance ratio.

From this multidimensional array of RESB values (three dimensions for the estimation points plus three additional dimensions for the smoothing parameters), the recommended approach is to find a fixed combination of smoothing parameters that yields a spatially and temporally homogeneous field of RESB. There is no unique solution for the “best” combination of smoothing parameters; the smoothing parameter in one dimension can be traded off against smoothing parameters in the other dimensions to obtain different resolutions with the same RESB.

By fixing the smoothing parameters to the same values at all estimation points, the wavenumber-frequency content of the estimated field is spatially and temporally homogeneous. This is a rather different philosophy than is usually adopted in the statistical literature. Statisticians generally select the smoothing parameter of a linear estimate according to the variance of the estimate (as opposed to the expected squared bias used here). The spans are then related to the number of observations in a linear estimate, rather than to the physical space spanned by the smoother. As shown in section 5.1, this approach causes the wavenumber-frequency content of the estimates to vary spatially and temporally, depending on the sampling distribution (see also Schlax and Chelton, 1992, section 2.3). Fixing the smoothing parameters to the same values everywhere yields estimates with essentially the same low-pass band at all estimation points.

In general, the RESB varies with estimation location when a fixed combination of smoothing parameters is used everywhere. This is why the recommended strategy is to seek a fixed combination of smoothing parameters that yields a spatially and temporally homogeneous field of RESB. The resolution capability in densely sampled areas of the data set is thus deliberately degraded by “oversmoothing” to the lower resolution that can be resolved in the sparsely sampled areas. The philosophy of this approach is that it is preferable to sacrifice the higher resolution that is possible at the densely sampled areas than to produce smoothed fields with spatially and temporally inhomogeneous spectral content and RESB that are purely an artifact of the data distribution and smoothing procedure.

If the interest is in short-scale variability, then low-pass filtering by the recommended approach is clearly undesirable. These shorter scales of variability can be retained as long as attention is restricted to the areas of the data record where they are adequately resolved. If the entire data set is to be analyzed as a single record, then the data must be low-pass filtered to retain only the long scales that are resolved everywhere in the data set.

Application of the method to the GEOSAT data set leads to the conclusion that the spatial and temporal scales of sea level variability that can be resolved are about 3° or 4° of latitude and longitude by about 30 days for estimates constructed at the crossovers of ascending and descending ground tracks. At shorter spatial and temporal scales, the RESB of the smoothed estimates varies substantially over the GEOSAT 17-day repeat period and with location in the grid of crossovers.

It should be kept in mind that the estimates of sampling errors presented in section 5 neglect the effects of measurement errors. Residual orbit errors in GEOSAT data are likely to render the resolution capability deduced here somewhat optimistic. The estimates of sampling error are also based on the nominal GEOSAT sampling pattern and thus assume complete data coverage. Because of problems with GEOSAT attitude control, seasonally varying data dropouts at middle and high latitudes were common along descending ground tracks in the northern hemisphere and ascending ground tracks in the southern hemisphere (see Cheney et al., 1988, Figure 2). This data loss also renders the GEOSAT resolution capability deduced here overly optimistic at locations and times of significant data dropouts.

The resolution capability of 3° or 4° by 30 days is adequate for studies of large-scale, low-frequency sea level variability. This is generally too large, however, for mapping mesoscale variability such as short-scale meanders in the flow and detachment and subsequent drift of rings. At the present time, there are two satellite altimeters simultaneously observing the global sea level variability. The ERS-1 altimeter launched in July 1991 and the TOPEX altimeter launched in August 1992 are expected to continue providing useful data for several years. By combining data from these two altimeters, it will be possible to map the sea level variability with higher spatial and temporal resolution than can be obtained from either altimeter individually. It is a straightforward application of the formalism presented in this paper to quantify the spatial and temporal resolution capability of the combined ERS-1 and TOPEX data sets.

APPENDIX  A.  QUADRATIC  LOESS  SMOOTHERS 


Loess smoothers are discussed extensively by Cleveland and Devlin (1988) and Schlax and Chelton (1992). The quadratic loess estimate at time $t_0$ is defined to be a local weighted least squares fit of a quadratic function of $t$ to the $N$ observations nearest $t_0$,

$$\hat{h}(t) = a_1 + a_2 t + a_3 t^2. \tag{A.1}$$

The smoothed estimate is the least-squares fit Eq. (A.1) evaluated at $t_0$. The coefficients $a_1$, $a_2$ and $a_3$ are determined by minimizing the function

$$\frac{1}{W} \sum_{j=1}^{N} w_j \left[ g_j - \hat{h}(t_j) \right]^2, \tag{A.2}$$

where $W$ is the sum of the weights $w_j$, defined by the bell-shaped function

$$w_j = \begin{cases} (1 - q_j^3)^3, & 0 \le q_j \le 1 \\ 0, & q_j > 1 \end{cases} \tag{A.3a}$$

$$q_j = \frac{|t_j - t_0|}{d_t}. \tag{A.3b}$$

The parameter $d_t$ is the half-span of the loess smoother.

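The loess smoother formalism is easily extended to three dimensions, in which case there are ten least squares parameters $a_i$ and the bell-shaped weighting function is ellipsoidal with half-spans $d_x$, $d_y$ and $d_t$,

$$q_j = \left[ \left( \frac{x_j - x_0}{d_x} \right)^2 + \left( \frac{y_j - y_0}{d_y} \right)^2 + \left( \frac{t_j - t_0}{d_t} \right)^2 \right]^{1/2}. \tag{A.4}$$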


The quadratic loess estimate can be expressed in the standard form of a linear estimate Eq. (2) by the impulse response method. This is most easily seen from Eq. (3). Suppose that the only observation is $g_k = 1$. In this case, $g(t) = \delta(t - t_k)$, i.e., an impulse at time $t_k$. By the sifting property of the Dirac delta function (Bracewell, 1978, p. 75), the loess smoothed estimate Eq. (3) then reduces to

$$\hat{h}_k(t_0) = \alpha(t_k; t_0) = \sum_{j=1}^{N} \alpha_j(t_0)\, \delta(t_j - t_k) = \alpha_k(t_0). \tag{A.5}$$

The smoother weight for the observation at time $t_k$ is therefore the quadratic loess smoothed estimate Eq. (A.5) obtained by replacing the $N$ observations with a single observation that has unit value at time $t_k$ and values of zero at all other observation times. The $N$ smoother weights $\alpha_j$ in Eq. (2) are thus derived by constructing $N$ such quadratic loess estimates, one for an impulse function at each of the observation times $t_j$.
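The impulse response method can be sketched as below, again with the standard library only. The loess fit assumes tricube weights and uses all observations within the half-span; the helper names and the irregular sample times are hypothetical and serve only to illustrate the procedure.

```python
def solve_linear(matrix, rhs):
    """Solve matrix x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def quadratic_loess(times, values, t0, d_t):
    """Tricube-weighted quadratic fit about t0, evaluated at t0."""
    w = []
    for t in times:
        q = abs(t - t0) / d_t
        w.append((1.0 - q ** 3) ** 3 if q <= 1.0 else 0.0)
    basis = [[1.0, t - t0, (t - t0) ** 2] for t in times]
    normal = [[sum(wj * bj[p] * bj[q] for wj, bj in zip(w, basis))
               for q in range(3)] for p in range(3)]
    rhs = [sum(wj * bj[p] * gj for wj, bj, gj in zip(w, basis, values))
           for p in range(3)]
    return solve_linear(normal, rhs)[0]


def impulse_response_weights(times, t0, d_t):
    """Weight k is the loess estimate for a unit impulse at observation k."""
    n = len(times)
    weights = []
    for k in range(n):
        impulse = [1.0 if j == k else 0.0 for j in range(n)]
        weights.append(quadratic_loess(times, impulse, t0, d_t))
    return weights


# Hypothetical irregular sample times; estimate at t0 = 5 with half-span 4.
obs_times = [0.0, 0.7, 1.9, 3.2, 4.1, 5.5, 6.0, 7.4, 8.8, 10.0]
alpha = impulse_response_weights(obs_times, 5.0, 4.0)
```

Once the weights are known, the smoothed estimate for any set of observed values at the same sample times is simply the weighted sum of those values, which is what makes the subsequent transfer function analysis possible.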

After obtaining the weights $\alpha_j$ for the particular smoothing parameter $d_t$ by the impulse response method, it is straightforward to determine the filtering characteristics of the quadratic loess smoother from the equivalent transfer function Eq. (6). The equivalent transfer functions for 1-dimensional quadratic loess smoothers with evenly and irregularly spaced observations are discussed in section 3.2 (see Figure 10).
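Eq. (6) is not reproduced in this section, so the sketch below assumes a common form for the transform of the weights, $\sum_j \alpha_j e^{-i 2\pi f (t_j - t_0)}$, with frequency in cycles per unit time. The sign convention is an assumption and does not affect the modulus. The three-point weights are an illustrative smoother, not the loess weights themselves.

```python
import cmath
import math


def equivalent_transfer_function(weights, times, t0, freq):
    """Transform of the smoother weights at one frequency (assumed form)."""
    return sum(alpha * cmath.exp(-2j * math.pi * freq * (t - t0))
               for alpha, t in zip(weights, times))


# Illustrative three-point smoother on evenly spaced samples (interval 1).
example_times = [-1.0, 0.0, 1.0]
example_weights = [0.25, 0.5, 0.25]
gain_at_zero = abs(equivalent_transfer_function(
    example_weights, example_times, 0.0, 0.0))
gain_at_nyquist = abs(equivalent_transfer_function(
    example_weights, example_times, 0.0, 0.5))
```

For evenly spaced samples the modulus repeats at intervals of the reciprocal sample spacing, which is the origin of the periodic aliasing peaks; for irregular samples no such exact periodicity exists.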

APPENDIX  B.  GAUSS-MARKOV  SMOOTHERS 

The formalism for Gauss-Markov estimation (also known as objective analysis) has been presented many times before (e.g., Gandin, 1965; Alaka and Elvander, 1972; Bretherton et al., 1976; Daley, 1991). The essential elements are reviewed here to establish a framework for investigating the filtering properties of Gauss-Markov estimates through the equivalent transfer function introduced in section 3.1. The smoother weights that minimize the mean squared error of the linear estimate Eq. (2) are given by

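$$\alpha_j(t_0) = \sum_{i=1}^{N} D^{-1}_{ij} A_i(t_0), \tag{B.1}$$

a result known as the Gauss-Markov theorem. In Eq. (B.1),

$$A_i(t_0) = \frac{\langle h(t_0)\, h(t_i) \rangle}{\langle h^2(t) \rangle} = \rho(t_0 - t_i) \tag{B.2}$$

is the signal autocorrelation at lag $(t_0 - t_i)$ and $D^{-1}_{ij}$ is the $i,j$th element of the inverse of the $N \times N$ cross correlation matrix of the data observations $g_j$. The elements of this cross correlation matrix are

$$D_{ij} = P_{ij} + \lambda^{-1} N_{ij}, \tag{B.3}$$

where

$$P_{ij} = \frac{\langle h(t_i)\, h(t_j) \rangle}{\langle h^2(t) \rangle} = \rho(t_i - t_j) \tag{B.4}$$

is the signal autocorrelation at lag $(t_i - t_j)$,

$$N_{ij} = \frac{\langle \epsilon_i \epsilon_j \rangle}{\langle \epsilon^2 \rangle} = \eta(t_i - t_j) \tag{B.5}$$

is the autocorrelation of the measurement errors and $\lambda$ is the signal-to-noise variance ratio.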

The linear estimate constructed from smoother weights given by Eq. (B.1) is optimal (i.e., has the lowest mean square error of all linear estimates of the form Eq. (2)) only if the true autocorrelations $\rho(\tau)$, $\eta(\tau)$ (where $\tau$ is time lag) and signal-to-noise ratio $\lambda$ are used. Moreover, the expected squared errors of estimates computed by this formalism are valid only if the correct values for these parameters are used. If these three parameters are specified in a more arbitrary manner (perhaps because of ignorance of the true values or in order to filter the data as described below), the solution is more appropriately referred to as suboptimal or Gauss-Markov estimation. The latter term will be used here.

In order to investigate the filtering properties of Gauss-Markov estimates, the signal autocorrelation function $\rho(\tau)$ will be assumed to be a Gaussian function of time lag, $\rho(\tau) = e^{-(\tau/\tau_0)^2}$, and the measurement errors will be assumed to be uncorrelated, $\eta(t_i - t_j) = \delta_{ij}$. The equivalent transfer functions of the corresponding Gauss-Markov estimates for error-free measurements and signal correlation time scales of $\tau_0$ = 30 and 60 are shown in Figure 19a for evenly spaced observations at sample interval $\Delta = 1$. The transfer functions are characterized by a flat low-frequency pass band with unit amplitude and a very sharp cutoff at a frequency $f_c$ set by $\tau_0$. The filtering properties can thus be controlled by adjusting the signal correlation time scale $\tau_0$, analogous to adjusting the half span $d_t$ of the quadratic loess smoother considered in Appendix A and section 3.2. The series of high frequency peaks centered on even multiples of the Nyquist frequency $f_N = (2\Delta)^{-1}$ are the aliasing peaks discussed in section 3.2 for the loess smoother (see Figure 10a).
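A minimal sketch of the Gauss-Markov weights under these two assumptions (Gaussian signal autocorrelation, uncorrelated measurement errors) is given below. It solves the linear system for the weights rather than forming the matrix inverse explicitly, which is equivalent because the correlation matrix is symmetric. The function names are illustrative, and the short correlation scale in the example is chosen only to keep the small matrix well conditioned.

```python
import math


def gaussian_correlation(lag, tau0):
    """Gaussian signal autocorrelation exp(-(lag / tau0)^2)."""
    return math.exp(-(lag / tau0) ** 2)


def solve_linear(matrix, rhs):
    """Solve matrix x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def gauss_markov_weights(times, t0, tau0, snr):
    """Smoother weights for estimation time t0 and signal-to-noise ratio snr."""
    n = len(times)
    # Data correlation matrix: signal correlation plus (1 / snr) on the
    # diagonal, since uncorrelated errors give an identity noise matrix.
    d = [[gaussian_correlation(times[i] - times[j], tau0)
          + (1.0 / snr if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    # Signal correlation between the estimation time and each observation.
    a = [gaussian_correlation(t0 - t, tau0) for t in times]
    return solve_linear(d, a)


# Example: 11 evenly spaced samples, estimate at the midpoint of the record.
record_times = [float(t) for t in range(11)]
example_weights = gauss_markov_weights(record_times, 5.0, 2.0, 1.0)
```

Lowering `snr` shrinks the weights toward zero, which is the weight-space counterpart of the reduced pass band amplitude of the transfer function for noisy measurements.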

The effects of measurement errors are shown in Figures 19b and c, which are the equivalent transfer functions of Gauss-Markov estimates with signal correlation time scale $\tau_0$ = 30 and signal-to-noise ratios of $\lambda$ = 1 and 0.1, respectively. With increasing measurement error variance (decreasing $\lambda$), the amplitude of the transfer function in the pass band decreases, the pass band cutoff frequency $f_c$ shifts to lower frequencies and the sharpness of the band-edge rolloff decreases. In the limit of zero signal-to-noise ratio, the equivalent transfer function collapses to zero everywhere, corresponding to a linear estimate of zero. Note that the equivalent transfer function for $\lambda$ = 1 is not significantly different from that of the quadratic loess smoother shown in Figure 10a, apart from the slightly less than unit value across the low-frequency pass band.

The direct incorporation of statistical information about measurement errors is an important advantage of Gauss-Markov estimates. The quadratic loess smoother and other commonly used linear smoothers have near unit amplitude across the entire low-frequency pass band, regardless of the measurement error characteristics. These other estimates therefore pass all of the low-frequency spectral energy of the measurement errors as well as of the signal of interest. As suggested by Press et al. (1992, section 13.3), the amplitudes of the transfer functions of the more traditional smoothers can be reduced in magnitude to mitigate the effects of measurement errors on the variance of the linear estimates.

The equivalent transfer functions of Gauss-Markov estimates for an example of irregularly spaced observations are shown in Figure 20 for signal-to-noise variance ratios of $\lambda = \infty$, 1 and 0.1. The low-pass bands of interest are essentially the same as those of their counterpart equivalent transfer functions for uniformly spaced observations shown in Figure 19. The noisy continuums of energy in the transfer functions at frequencies higher than $f_c$ represent the effects of “aliasing” from the uneven sample design, as discussed in section 3.2. The details of this aliasing depend on the particular sample design and on the estimation time $t_0$.

Figure 19. The 1-dimensional equivalent transfer function moduli of the Gauss-Markov smoother with Gaussian autocorrelation function and an evenly spaced sample design for a) signal-to-noise variance ratio $\lambda$ = 10,000 and signal correlation scales $\tau_0$ = 30 (solid line) and $\tau_0$ = 60 (dashed line); b) $\lambda$ = 1 and $\tau_0$ = 30; and c) $\lambda$ = 0.1 and $\tau_0$ = 30. In all three panels, the estimation point is at the midpoint of the data record.

Figure 20. The same as Figure 19, except for an irregularly spaced sample design.

Although  it  is  not  generally  viewed  in  this  context,  it  is  apparent  from  the  equivalent 
transfer  functions  in  Figures  19  and  20  that  Gauss-Markov  estimation  can  be  considered 
as  a  low-pass  smoother.  In  this  sense,  it  is  just  like  any  of  the  other  smoothers  that  are 
commonly  used  to  low-pass  filter  a  noisy  or  irregularly  sampled  data  set.  The  low-pass 
cutoff  frequency  and  sharpness  of  the  band-edge  rolloff  of  the  equivalent  transfer  function 
are controlled by appropriate choices of $\tau_0$ and $\lambda$. It should be noted that Gauss-Markov
estimates with arbitrarily prescribed $\tau_0$ and $\lambda$ are not the optimal estimate of low-pass
filtered  data.  Such  an  optimal  estimate  can  be  constructed,  however,  by  an  extension  of 
the  Gauss-Markov  formalism  to  find  the  minimum  mean  squared  error  estimate  for  the 
linear  filtering  operator  applied  to  the  data. 

The disadvantage of Gauss-Markov estimates is the computational effort required
to obtain the inverse of the $N \times N$ cross correlation matrix needed to determine the
smoother weights by Eq. (B.1). If the primary interest is to obtain low-pass filtered
estimates of $h(t)$ (as it is in this study), this computational effort is unwarranted; the
filtering properties of the quadratic loess smoother considered in Appendix A and section 3.2
are very similar to those of the Gauss-Markov smoothers for realistic signal-to-noise ratios
of $\lambda = 1$ to 10. The computational effort required for quadratic loess estimates is much
smaller for large values of $N$.

ABSTRACT

Some  physical  variables  are  natural  spatial  integrals  of  oceanic  water  motion  or  state 
properties.  Observation  of  these  variables  permits  isolation  of  physical  processes  that 
might  otherwise  be  difficult  to  examine  because  of  the  superposition  of  many  phenomena 
at  one  place.  Independent  of  a  particular  physical  model,  observations  of  such  integrating 
quantities  frequently  enable  direct  determination  of  relatedness  between  variables  at 
different  locations,  and  direct  determination  of  causality,  while  more  traditional  point 
observations  may  fail  to  find  such  relationships.  Furthermore,  integral  quantities  such  as 
volume  and  heat  transport,  which  are  now  being  studied  with  great  fervor  because  of  their 
climatic  importance,  are  likely  more  accurately  estimated  using  observations  of 
“integrating”  variables  than  using  a  set  of  point  measurements.  Examples  of  integrating 
types  of  variables,  such  as  horizontal  electric  fields,  vertical  acoustic  travel  time  and 
bottom  pressure,  are  used  to  demonstrate  the  ideas  above  with  examples  drawn  from  the 
study  of  (a)  atmospherically  forced,  mesoscale  motions,  and  (b)  the  volume  and  heat 
transports  of  the  Gulf  Stream. 

INTRODUCTION 

At  any  particular  location  in  the  oceans,  the  sub-inertial  water  motions  and  fluctuations  of 
state  properties  are  likely  to  be  due  to  a  superposition  (and,  possibly,  interaction)  of  a 
variety  of  phenomena  that  each  have  specific  and  different  balances  between  acceleration, 
advection,  Coriolis  forces,  pressure,  dissipation,  external  forcing,  and  so  on.  Time- 
dependent  boundary  layers  exist  as  a  result  of  property  fluxes  to  and  from  the  atmosphere 
and earth. Semi-permanent meso- and gyre-scale currents (O(100 km) and O(1000 km),
respectively)  of  the  “general  circulation”  are  forced  by  the  winds  and  property  fluxes,  and, 
through  instabilities,  produce  meso-  and  gyre-scale  variability  in  the  form  of  meandering 
currents,  coherent  vortices,  radiated  waves,  and  so  on.  Meso-  and  gyre-scale  variability 
can  also  be  directly  driven  by  the  atmosphere.  Each  of  these,  and  many  other  unlisted 
phenomena,  exist  at  a  variety  of  space  scales  for  each  time  scale,  so  that  they  overlap  each 
other not only in physical space and time, but in frequency and wavenumber space as well.
To decipher the ocean’s physics, it is often preferable to examine a single phenomenon at
a  time.  Then  one  has  to  consider  “contamination”  from  the  other  phenomena  that  would 
inhibit,  for  instance,  direct  detection  of  relatedness  among  oceanic  variables  and  between 
oceanic  and  atmospheric  variables. 

There  are  of  course  a  number  of  strategies  for  isolating  particular  phenomena  in  order  to 
study  their  kinematics  and  dynamics.  Sometimes,  time  series  of  variables  are  all  that  is 
needed  to  separate  phenomena  according  to  their  characteristic  frequencies.  Other  times, 
spatial  information  is  needed,  which  raises  the  cost  and  difficulty  of  a  field  experiment,  but 
which  allows  discrimination  of  wavenumbers  or  principal  components  and  thereby  possible 
discrimination  of  different  processes.  Frequently,  experiments  are  designed  so  that  there  is 
a  reasonable  certainty  that  the  phenomenon  to  be  studied  dominates  all  other  processes. 
However,  there  are  many  instances  when  this  cannot  be  accomplished.  In  these  cases, 
observations  are  usually  compared  with  model  output  visually,  graphically,  statistically,  or 
through  dynamical  parameter  estimation.  Such  comparisons  can  lead  to  the  identification 
of  the  quality  of  the  dynamical  hypotheses  as  a  function  of  frequency  and/or  wavenumber. 
It  is  not  unusual  for  experiments  to  be  designed  to  take  advantage  of  most  if  not  all  of  the 
strategies  above. 

The  purpose  of  this  note  is  to  point  out  that  there  now  exists  an  additional  observational 
strategy,  most  components  of  which  are  rather  new  to  oceanography,  for  isolating 
phenomena  that  are  large  scale  in  the  vertical  and/or  horizontal.  This  strategy  is  based  on 
the  measurement  of  integrating  variables.  The  spatial  filtering  inherent  in  these  variables 
frequently  enables  statistical  confirmation  of  important  large  scale  kinematic  and  dynamic 
relationships  which  might  otherwise  go  undetected  except  with  a  formidably  large  array  of 
point  measurements.  Yet,  in  deference  to  the  theme  of  this  workshop,  it  must  be 
acknowledged  that  isolating  large  scale  phenomena  does  not  imply  that  the  phenomena 
observed,  or  statistics  of  these  phenomena  estimated  from  integrating  variables,  are 
homogeneous  over  large  scales  as  well.  This  inhomogeneity  complicates,  if  not  invalidates, 
the  application  of  many  statistical  procedures  that  assume  homogeneity. 

We  define  integrating  variables  as  those  that  are  natural  spatial  integrals  of  oceanic  water 
motion  or  state  properties.  Table  1  lists  a  few  of  the  more  important  integrating  variables 
being  observed  today.  These  integrating  variables  are  ones  that  by  their  very  nature  tend 
to  filter  out  the  shorter  spatial  scale  variability.  The  techniques  we’ll  discuss  in  this  note 
are  those  whose  usefulness  is  well-established  and  which  offer  the  advantage  of  cost- 
effectiveness.  In  addition,  these  techniques  may  have  greater  accuracy  in  comparison  to 
using  a  suite  of  point  measurements  when  the  ultimate  goal  of  an  investigation  is  the 
measurement of an integral quantity such as volume transport.

Table 1. Examples of Integrating Variables.

| Variable | Principal component of integrand | References |
|---|---|---|
| Horizontal electric fields at a point | Conductivity-weighted horizontal water velocity, from seafloor to sea surface | Sanford (1971); Chave & Luther (1990); Luther et al. (1991) |
| Voltages across fixed horizontal distances (typically, using abandoned undersea telephone cables) | Conductivity-weighted horizontal water velocity (one component only), from seafloor to sea surface over a fixed horizontal distance | Larsen & Sanford (1985); Larsen (1992); Chave et al. (1992b) |
| Vertical acoustic travel time | Inverse sound speed from seafloor to sea surface | Watts & Rossby (1977); Pickart & Watts (1990) |
| Bottom pressure | Horizontal water velocity near the seafloor, over a fixed horizontal distance | Brown et al. (1975); Whitworth & Peterson (1985) |
| Horizontal acoustic travel time (acoustic thermometry) | Inverse sound speed along horizontal, depth-varying ray paths | Munk & Forbes (1989) |
| Reciprocal acoustic travel time | Water velocity along horizontal, depth-varying ray paths | Worcester (1977); Worcester et al. (1991) |
| Orientation of the earth’s axis of rotation | Global mass distribution (especially in hydrologic reservoirs) | Chao (1988); Eubanks (1993) |
| Rotation rate of the earth | Global atmospheric angular momentum (principally, fluctuations in zonal winds) | Hide & Dickey (1991); Eubanks (1993) |

Measurement of integrating variables allows the investigator to focus immediately on a
specific  region  of  wavenumber  space,  without  the  “contamination”  of  shorter  scale 
variability  that  may  depend  on  processes  other  than  the  one  being  sought.  Furthermore, 
such  restriction  of  the  wavenumber  space  may  enable  the  detection  of  properties  (like 
spatial  coherence  or  air-sea  coherence)  that  tend  to  zero  as  the  wavenumber  bandwidth 
increases  and  may  provide  more  useful  constraints  for  numerical  model  simulations  than 
do  point  measurements. 
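
The spatial filtering can be made concrete with a simple idealization. The sketch below assumes an unweighted average over a path of length L (a boxcar), which is not how any particular variable in Table 1 weights its integrand. For that idealized case, a sinusoid of wavenumber k is attenuated by |sin(kL/2) / (kL/2)|. The function name and the 4700 m example path length are illustrative choices.

```python
import math

def boxcar_gain(wavelength, path_length):
    """Amplitude gain of an unweighted average over path_length for a
    sinusoid of the given wavelength (both in the same units)."""
    x = math.pi * path_length / wavelength   # equals k L / 2 with k = 2 pi / wavelength
    return 1.0 if x == 0.0 else abs(math.sin(x) / x)

if __name__ == "__main__":
    path = 4700.0  # m, illustrative averaging length
    for wavelength in (100.0, 1000.0, 4700.0, 47000.0):
        print(wavelength, round(boxcar_gain(wavelength, path), 3))
```

Scales much longer than the averaging path pass nearly unattenuated, while scales comparable to or shorter than the path are strongly suppressed; this is the sense in which an integrating variable restricts the wavenumber band.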

In the sections that follow, we will describe applications of three of the more
underutilized, yet most cost-effective, integrating variables listed in Table 1:
point measurements of horizontal electric fields (HEFs), vertical acoustic travel time
(VATT), and bottom pressure ($P_b$). We will show how observations of HEFs and $P_b$ in
the Barotropic, Electromagnetic and Pressure Experiment (BEMPEX) provided definitive
evidence of the existence of gyre-scale motions that are directly forced by sea surface
winds. Horizontal electric field data from the Synoptic Ocean Prediction (SYNOP)
experiment will be shown that suggest the greater accuracy of these integrating variables
in estimates of volume transport. Finally, we will outline the potential utility of combining
HEF and VATT observations to obtain nearly direct estimates from the seafloor of heat
transport and the gravest vertical structures of horizontal currents and temperature
fluctuations.


HORIZONTAL  ELECTRIC  FIELDS 

Motional  electromagnetic  induction  is  now  theoretically  well  understood  in  certain 
idealized settings (e.g., Sanford, 1971; Chave and Luther, 1990). Assuming distant
continental boundaries and a flat seafloor with laterally homogeneous conductivity, then
for the low-frequency limit where the aspect ratio of ocean currents is small, where the
effect  of  self  induction  is  weak,  and  where  the  vertical  velocity  can  be  neglected  in 
comparison  with  the  horizontal  velocity,  it  can  be  shown  that  the  point  HEFs  are  related 
to  horizontal  water  velocity  by 

$$\mathbf{E}_h(x,y,t) = C F_z\, \hat{\mathbf{k}} \times \langle \mathbf{v}_h(x,y,t) \rangle^* + \frac{\mathbf{J}^*}{\sigma} + \mathbf{N}(x,y,t), \tag{1}$$

where

$$\langle \mathbf{v}_h(x,y,t) \rangle^* = \frac{\int_{-H}^{0} dz\, \sigma(x,y,z,t)\, \mathbf{v}_h(x,y,z,t)}{\int_{-H}^{0} dz\, \sigma(x,y,z,t)} \tag{2}$$

and is called the conductivity-weighted, vertically averaged (CWVA) water velocity;
$\mathbf{v}_h(x,y,z,t)$ is horizontal water velocity; $\sigma(x,y,z,t)$ is seawater electrical conductivity;
$F_z$ is the vertical component of the geomagnetic field; and $H(x,y)$ is ocean depth. The scale
factor $C$ depends on $\sigma(z < -H)$; $C$ can be estimated by intercomparisons, but extensive
geophysical evidence suggests that $C = 0.95 \pm 0.05$ almost everywhere in the deep oceans
(e.g., Chave and Luther, 1990). A noise term $\mathbf{N}(x,y,t)$ is composed of externally
produced (i.e., in the ionosphere and magnetosphere) electromagnetic fields that dominate
for periods shorter than a few days but are negligible at longer periods (e.g., Chave et al.,
1989).


Locally and non-locally produced electric currents are represented by $\mathbf{J}^*$. Given usual
oceanic scales of motion at sub-inertial periods (greater than half a pendulum day), locally
produced electric currents are theoretically negligible if the bottom is flat (Chave and
Luther, 1990) or the flow is aligned along isobaths (Stephenson and Bryan, 1992). Local
generation of $\mathbf{J}^*$ may be sufficient to inhibit accurate estimation of ocean water currents
with electric fields only where the currents cut across isobaths and then only if the
underlying sediments are relatively non-conductive (Larsen, 1992; Stephenson and Bryan,
1992). Meandering of a narrow current like the Gulf Stream can theoretically produce
non-zero $\mathbf{J}^*$ outside of the stream boundaries (the principal example of non-local
generation of $\mathbf{J}^*$), which theoretically could be a large noise relative to the electric field
signal induced by the smaller water currents there. However, Sanford (1986) has pointed
out that observations have shown $\mathbf{J}^*/\sigma$ to be small [yielding errors of O(1 cm/s)] and
generally negligible. And our own work with the SYNOP data has shown that the best
agreement between the moored current meter data and the horizontal electrometer data
occurs where the currents are moderate to weak, resulting in no detectable $\mathbf{J}^*$. Therefore,
in the following, $\mathbf{J}^*$ is ignored.


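Dropping the horizontal dependences and letting $\sigma(z,t)$ equal a vertical average part plus a
residual, i.e.,

$$\sigma(z,t) = \langle \sigma(t) \rangle + \sigma'(z,t), \quad \text{where} \quad \langle \sigma(t) \rangle = \frac{1}{H} \int_{-H}^{0} dz\, \sigma(z,t), \tag{3}$$

then Eq. (2) becomes

$$\langle \mathbf{v}_h(t) \rangle^* = \langle \mathbf{v}_h(t) \rangle + \frac{\int_{-H}^{0} dz\, \sigma'(z,t)\, \mathbf{v}_h(z,t)}{H \langle \sigma(t) \rangle}. \tag{4}$$
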

The first term on the right hand side (RHS) of Eq. (4) is just the vertical average of
horizontal water velocity (or depth-normalized transport per unit width). The second term
on the RHS of Eq. (4) is a small contribution because seawater conductivity is a weak
function of depth. Note that $|\sigma'(z,t)| \ll \langle \sigma(t) \rangle$; $\langle \sigma(t) \rangle$ is always greater than 3 Siemens
m$^{-1}$; to a good approximation, $\sigma(z,t) = 3.0 + 0.09\,T(z,t)$ Siemens m$^{-1}$, where $T(z,t)$ is
unless the horizontal currents are strong and baroclinic (i.e., have large vertical shear).
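Assuming low-frequency motions so that $\mathbf{N}(t)$ can be ignored, and using Eq. (4), the
components of Eq. (1) in the northern hemisphere become

$$\frac{E_S^u(t)}{C\,|F_z|} = \langle u(t) \rangle + \frac{1}{H \langle \sigma(t) \rangle} \int_{-H}^{0} \sigma'(z,t)\, u(z,t)\, dz, \tag{5a}$$

$$\frac{E_E^v(t)}{C\,|F_z|} = \langle v(t) \rangle + \frac{1}{H \langle \sigma(t) \rangle} \int_{-H}^{0} \sigma'(z,t)\, v(z,t)\, dz, \tag{5b}$$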

where  the  subscript  on  E  denotes  the  direction  in  which  that  term  is  positive  and  the 
superscript  indicates  the  horizontal  water  velocity  component  to  which  it  is  proportional. 
Clearly,  the  HEFs  are  integrating  variables  in  the  sense  defined  in  the  introduction,  being 
proportional  to  the  vertical  average  of  horizontal  water  velocity  plus  a  small 
“contamination”  due  to  conductivity.  The  conductivity  contribution  is  negligible  if 
conductivity  is  independent  of  depth  in  the  ocean  (as  it  is  at  very  high  latitudes)  or  if  the 
horizontal  water  velocity  has  little  vertical  shear  (as  frequently  occurs  at  mid-  to  high- 
latitudes). 
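
The decomposition of the conductivity-weighted average into a plain vertical average plus a conductivity term can be checked numerically. The sketch below is illustrative only: the temperature and velocity profiles are invented, exponentially surface-intensified shapes, the linear conductivity-temperature relation 3.0 + 0.09 T S/m is taken as read from the text above, and the function names are hypothetical. It evaluates Eq. (2) directly and compares it with the two terms of Eq. (4), using the same trapezoidal quadrature for both.

```python
import math

def trapz(y, z):
    """Trapezoidal integral of samples y over increasing coordinates z."""
    return sum(0.5 * (y[i] + y[i + 1]) * (z[i + 1] - z[i])
               for i in range(len(z) - 1))

def cwva_terms(z, sigma, v):
    """Return (CWVA velocity, plain vertical average, conductivity term)."""
    depth = z[-1] - z[0]
    mean_v = trapz(v, z) / depth
    mean_sigma = trapz(sigma, z) / depth
    cwva = trapz([s * x for s, x in zip(sigma, v)], z) / trapz(sigma, z)
    resid = [s - mean_sigma for s in sigma]          # sigma'(z)
    term2 = trapz([r * x for r, x in zip(resid, v)], z) / (depth * mean_sigma)
    return cwva, mean_v, term2

# Hypothetical profiles on a 10 m grid from the seafloor (z = -H) to the surface.
H = 4700.0
z = [-H + 10.0 * k for k in range(471)]
temp = [2.0 + 18.0 * math.exp(zk / 800.0) for zk in z]    # deg C, assumed shape
sigma = [3.0 + 0.09 * t for t in temp]                    # S/m
v = [0.05 + 0.50 * math.exp(zk / 800.0) for zk in z]      # m/s, assumed shape

cwva, mean_v, term2 = cwva_terms(z, sigma, v)
print(round(cwva, 4), round(mean_v, 4), round(term2 / mean_v, 3))
```

Because both conductivity and velocity increase toward the surface in this invented example, the conductivity term is positive, and it remains a modest fraction of the vertical average.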

Normal  Modes 




To  elucidate  further  the  relationships  in  Eq.  (5),  it  is  useful  to  expand  the  various 
quantities  in  terms  of  the  dynamical  normal  modes.  While  any  complete  basis  set  could  be 
used,  the  dynamical  normal  modes  have  a  vertical  dependence  that  permits  rapid 
convergence  of  the  expansions  of  horizontal  velocity  and  seawater  conductivity  and 
temperature.  Despite  the  fact  that  the  dynamical  normal  modes  are  obtained  from  the 
equations  of  motion  by  various  simplifying  assumptions  such  as  no  mean  flow  and 
linearity,  in  using  these  modes  here  we  are  not  making  any  assumptions  about  the 
underlying  dynamics  of  the  quantities  being  observed.  The  modes  are  simply  the  most 
convenient  basis  set  for  our  purposes. 


The  dynamical  modes  are  obtained  from  the  equations  of  motion  by  assuming  no  mean 
flow,  mean  stratification  in  hydrostatic  balance,  and  a  flat  bottom.  With  horizontal  velocity 
and pressure proportional to $\phi_i(z)$, and vertical velocity and density perturbations
proportional to $\theta_i(z)$, the equations for Boussinesq linear waves (small perturbations) then
separate  into  sets  of  equations  for  the  horizontal/time  dependence  and  vertical 
dependence.  Specifically,  with 


Vkiz,t)  1 

p(z,/)/p.J 


^[d,(x,y,r) 


(6a) 


and 
I  wa,)  L


where  p  =  p.  +  piz)+p'(z,t)  and  N^iz)  =  ®i  satisfy 

A 


(6b) 


.  de,  .  d^i  . 

(6c) 

with  the  boundary  conditions 

dj  =  0  or  ^  =  0  at  z  =  -f/, 
dz 

(6d) 

dz 

4«i=0or  ^+— ^,=0atz  =  0. 

K  dz  ^ 

(6c) 

Equations (6c) are solved numerically with the appropriate boundary conditions to
produce the vertical structure functions and eigenvalues, $\gamma_i$. The structure functions are
orthogonal and are normalized so that


(6f) 


This normalization means that the $\phi_i$ are non-dimensional, while the $\theta_i$ have dimensions of
length. We now have a complete orthonormal basis set for describing any quantity that
varies with depth. The fact that these modes are “tuned” to the depth-dependences of real
oceanographic variables makes them very useful, since it means expansions in terms of
these modes should converge rapidly. For our purposes in this section, it is not important
what assumptions were used to obtain the vertical structure equations in Eq. (6c).

Let’s now expand conductivity in terms of the dynamical modes, viz.,

$$\frac{\sigma'(z,t)}{\langle \sigma(t) \rangle} = \sum_{i=1}^{\infty} s_i(x,y,t)\, \phi_i(z), \tag{7}$$

where the $i=0$ (barotropic) mode has been dropped since the vertical average of $\sigma'$ is zero.
Substituting this expression and those in Eq. (6a) into Eq. (5), and truncating after mode
number 1, yields

The truncation in Eq. (8) is quite reasonable given the expected decrease in modal current
amplitudes with increasing mode number, and given the examples in Table 2 of mean $s_i$
calculated using Levitus (1982) data to compute structure functions and conductivity
profiles. Table 2 suggests that in polar oceans and in the mid-latitude Pacific the
conductivity-weighting contribution to the electric field is completely trivial. Using a year
of electric field and current meter mooring data, Luther et al. (1991) have shown the
validity of Eq. (5) in a region of the mid-latitude North Pacific with weak mean currents
and weak baroclinic eddy activity. In that location, the conductivity-weighting
contribution to the electric field was completely trivial.

Table 2. Expansion coefficients for conductivity from Eq. (7).

| Mode (i) | s_i, 32.5°N Atlantic | s_i, 42.5°N Pacific | s_i, 57.5°S Atlantic |
|---|---|---|---|
| 1 | 0.119 | 0.017 | -0.004 |
| 2 | -0.014 | 0.021 | -0.009 |
| 3 | -0.012 | -0.002 | -0.004 |
| 4 | 0.008 | 0.008 | -0.001 |

On  the  other  hand,  in  the  mid-latitude  North  Atlantic,  equal  amplitudes  of  the  barotropic 
($i=0$) and first baroclinic ($i=1$) modes of current imply a ~12% relative contribution to the
electric  field  from  the  last  term  on  the  RHS  of  Eq.  (5).  Rossby  (1987)  found  first 
baroclinic  mode  amplitudes  up  to  70%  greater  than  barotropic  mode  amplitudes  in  the 
Gulf  Stream  at  73°W,  with  very  small  second  and  third  baroclinic  mode  amplitudes. 
Consequently,  in  the  Gulf  Stream  we  can  expect  up  to  20%  conductivity-weighting 
contributions  to  the  electric  field  due  to  the  first  baroclinic  mode  of  current.  In  fact,  our 
work  with  SYNOP  data  has  shown  occasional  conductivity-weighting  contributions  up  to 
30%,  although  the  mean  contribution  is  <15%. 
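
The percentages quoted above follow from simple arithmetic if the truncated expansion is assumed to take the form (CWVA velocity) = v0 + s1 * v1, with v0 and v1 the barotropic and first baroclinic current amplitudes. These symbols and this form are assumptions here, inferred from the orthonormality of the modes rather than copied from the text. The relative conductivity contribution is then s1 * (v1 / v0). A minimal sketch using the mode-1 coefficients of Table 2:

```python
# Mode-1 conductivity expansion coefficients from Table 2.
S1 = {
    "32.5N Atlantic": 0.119,
    "42.5N Pacific": 0.017,
    "57.5S Atlantic": -0.004,
}

def relative_contribution(s1, amp_ratio):
    """Conductivity term relative to the barotropic term, s1 * (v1 / v0).
    amp_ratio is the assumed ratio of first baroclinic to barotropic amplitude."""
    return s1 * amp_ratio

if __name__ == "__main__":
    for site, s1 in S1.items():
        # amp_ratio 1.0: equal amplitudes; 1.7: baroclinic mode 70% larger
        print(site, relative_contribution(s1, 1.0), relative_contribution(s1, 1.7))
```

With equal amplitudes the mid-latitude North Atlantic value gives about 12%, and with a first baroclinic amplitude 70% greater than the barotropic amplitude it gives about 20%, consistent with the figures in the text.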

The  variation  of  conductivity  with  depth  in  the  oceans  (e.g.,  Luther  et  al.,  1991)  suggests 
dominance of the first baroclinic mode in the conductivity contribution to Eq. (5), which
allows  the  use  of  another  integrating  measurement,  vertical  acoustic  travel  time,  to 
estimate  first  baroclinic  mode  current  and  temperature  amplitudes  in  order  to  remove  the 
conductivity  contribution  from  the  HEF.  This  procedure  is  outlined  later. 


Horizontal  Electrometer  (HEM)  Versus  Mooring  Estimates  of  Transport 

In deployments of seafloor HEMs to date, where comparison of HEM estimates of the
vertically averaged horizontal water velocity, $\langle \mathbf{v}_h(t) \rangle$, with current meter mooring
estimates of the same quantity has been possible, the HEM estimates have proven to be
more accurate. These results provide an example of how measurement of an integrating
variable provides a more accurate estimation of oceanic behavior than can be
accomplished with a suite of conventional point measurements. In this specific case such
accuracy has significant importance to climate studies that rely on estimates of transport
(which is directly proportional to horizontal integrals of $H \langle \mathbf{v}_h(t) \rangle$) for determining the
world ocean’s role in climate fluctuations.

The first HEM vs. mooring comparison of $\langle \mathbf{v}_h(t) \rangle$ estimates was produced by Luther et
al. (1991) from data collected during BEMPEX, an experiment that deployed a large
number of HEMs and pressure gauges in the North Pacific to study direct atmospheric
forcing of gyre-scale eddies (the results of which are discussed further below). The
accuracy of the HEM estimates of $\langle \mathbf{v}_h(t) \rangle$ was corroborated by current estimates made
by reciprocal tomography, which is based on measuring reciprocal acoustic travel time
differences (Table 1). The inaccuracy of the current meter mooring estimates was
attributed primarily to stalling of the current meter rotors in the weak flows below 1000
meters depth. Another electrometer-mooring comparison presented below comes from the
opposite extreme for oceanic flows, i.e., from the Gulf Stream which has strong currents
at all depths so that rotor stalling is not expected to be a problem.

The Office of Naval Research provided funds for us (with Jean Filloux) to deploy four of
Filloux's seafloor HEMs (Filloux, 1987) next to current meter moorings during the last
year ('89-'90) of the SYNOP experiment in the Gulf Stream. The HEMs were deployed in
an array centered around 37.5°N, 68.5°W, at depths near 4700 m. Near each HEM were
sub-surface moorings deployed by J. Bane, T. Shay, R. Watts, and W. Johns, carrying
current meters at nominal depths of 400 m, 700 m, 1000 m and 3500 m.

The  LHSs  of  Eqs.  (5a)  and  (5b)  were  obtained  from  the  HEM  data  using  C=0.95,  as  per 
estimates  of  C  made  by  Sanford  et  al.  (1985)  in  the  western  North  Atlantic,  and  using  an 
appropriate estimate of $F_z$ for the time and location of the experiment, while the RHSs
were  estimated  from  the  mooring  data.  The  latter  estimation  included  extrapolation  of 
temperature  and  pressure  to  a  fictitious  nominal  100-m  depth,  conversion  of  pressure  to 
‘depth,’  and  estimation  of  conductivity  using  temperatures  and  a  climatological 
temperature/salinity  relation  in  an  empirical  formula.  The  currents  and  conductivities  at  the 
four  real  and  one  “fictitious”  instruments  were  trapezoidally  integrated,  taking  into 
account  the  time  dependence  of  the  depths  of  the  instruments.  The  time  series  thus 
obtained,  representing  opposite  sides  of  Eq.  (5),  are  highly  coherent,  as  shown  in  Figure  1. 

While the coherence in Figure 1 is very encouraging, and the rms differences between the
estimates of the LHS and RHS of Eq. (5) are no larger, for instance, than what has been
considered very good agreement for testing schemes to remove the effects of mooring
motion from current meter data (e.g., Hogg, 1991; Cronin, 1991), examination of the
individual time series (not shown) indicates that the LHS of Eq. (5) consistently has a
greater magnitude than the RHS. That there is a systematic under-estimation of current by
the mooring data, or an over-estimation by the HEM data, is most easily seen by casting
the data in terms of a Gulf Stream coordinate system, rather than a geographic coordinate
system, since the Gulf Stream position and direction vary with time.


Daily locations of the temperature front of the Gulf Stream (provided by R. Watts and W.
Johns) were determined from an array of Inverted Echo Sounders (IESs) that measure
VATT. These locations permitted estimation of the cross-stream positions of the moorings
every day. Gulf Stream directions at each mooring were determined from the 400-1000 m
shear. Daily estimates of Eq. (5) were then rotated into these Gulf Stream coordinates.

Figure 1. Coherence amplitudes between electrometer and mooring estimates of
conductivity-weighted, vertically averaged water velocity (left and right hand sides of Eq.
(5), as described in the text). Top is for zonal currents (Eq. (5a)). The 95% level of no
significance is indicated. Every other point plotted is independent due to a 50% overlap of
frequency band-averaging. (Axes: period in days; frequency in cph.)

To put the problem in a more interesting context, the conductivity contribution term
(the last term on the RHS of Eq. (5)) was moved to the LHS of Eq. (5). Now we can
compare mooring estimates of vertically averaged water velocity, $\langle \mathbf{v}_h(t) \rangle$, with HEM
estimates of the same quantity (that incorporates a small mooring-derived conductivity
correction), all cast in terms of Gulf Stream coordinates. The results for a single mooring
are shown in Figure 2, along with the difference (error) between the two estimates. (Note
that the error is not dependent upon which side of Eq. (5) we place the conductivity
correction term.) The results in Figure 2 typify the comparisons made at other HEM
locations. Integrating the estimates in Figure 2 across the stream results in a ~30% higher
estimate of total transport from the HEM than from the mooring. This is certainly
non-trivial.
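
The rotation into stream coordinates described above can be sketched as follows. This is an illustration under stated assumptions, not the processing actually applied to the SYNOP data: the stream direction is taken as the direction of the velocity difference between an upper and a lower current meter, the cross-stream axis is taken positive to the left of the flow, and both function names are hypothetical.

```python
import math

def stream_angle(u_upper, v_upper, u_lower, v_lower):
    """Direction (radians, counterclockwise from east) of the vertical shear
    between an upper and a lower instrument."""
    return math.atan2(v_upper - v_lower, u_upper - u_lower)

def to_stream_coords(u, v, angle):
    """Rotate an (east, north) vector into (along-stream, cross-stream)."""
    along = u * math.cos(angle) + v * math.sin(angle)
    cross = -u * math.sin(angle) + v * math.cos(angle)
    return along, cross

if __name__ == "__main__":
    # Hypothetical daily values (m/s): shear pointing to the northeast.
    angle = stream_angle(1.0, 1.0, 0.2, 0.2)
    print(to_stream_coords(0.3, 0.3, angle))
```

Applying the same rotation each day, with that day's shear direction, expresses every velocity estimate relative to the instantaneous orientation of the stream rather than to fixed geographic axes.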

Figure 2 shows good agreement between the estimates at distances farther “south” than
-60 km and farther “north” than 30 km from the north wall of the Gulf Stream. The
percentage difference between the estimates is not constant across the stream, implying
that the difference is not due to a calibration error of the HEM. While there are many
possible noises and small errors in the HEM data, none is known to result in an
overestimate of velocity. We believe the error arises in the current meter data and/or its
trapezoidal integration, but to date we have clearly identified only one source of error,
which by itself, however, is insufficient to account for all of the error in Figure 2.
Conductivity and temperature versus depth (CTD) data taken at this longitude by M. Hall
indicate that the extrapolation of currents to the near-surface underestimates the upper
ocean velocities (at and north of the maximum current) and the trapezoidal integration,
which implies linear interpolation between 1000 and 3500 m, underestimates the transport
between 1000 and 3500 m (again, at and north of the maximum current). Error from the
trapezoidal integration is further suggested by the fact that the error time series is most
highly coherent with currents measured at 1000 m.

Figure 2. Two estimates of vertically averaged water velocity in the Gulf Stream at
nominally 68°W, as a function of cross-stream distance (km; positive north), using
electrometer and mooring data (see text). The differences between these estimates are also
plotted.

Figure 3. Polar stereographic projection of the North Pacific Ocean displaying seafloor
isobaths from 4000 to 5500 m, and the locations of seafloor electrometers (solid circles) and
pressure gauges during BEMPEX. Adjacent land masses are also shown.


Observation  of  Atmospheric  Forcing  of  Sub-Inertial  Gyre-Scale  Eddies 

A  good  example  of  the  use  of  measurements  of  integrating  variables  to  explore  a 
phenomenon  that  defied  unambiguous  detection  with  traditional  point  measurements  is  the 
Barotropic,  Electromagnetic  and  Pressure  Experiment  (BEMPEX).  BEMPEX  employed 
HEMs and bottom pressure gauges to specifically test theories (Frankignoul and Müller,
1979; Müller and Frankignoul, 1981) of stochastic forcing by the atmosphere of
sub-inertial gyre-scale motions in the ocean. BEMPEX, fielded by ourselves with Jean Filloux
and funded by the National Science Foundation, obtained seven HEF records and five
bottom pressure records from a two-dimensional array spanning 1000 km centered around
40°N, 163°W (Figure 3) and lasting 11 months beginning in August, 1986. Luther et al.
(1991) showed that the conductivity contribution (Eq. (5)) to the HEFs in BEMPEX was
trivial, so that the HEFs were directly proportional to vertically averaged (barotropic)
water velocity. In the following, we’ll simply refer to the barotropic currents, rather than
the HEFs, obtained from the HEMs.

Four  of  the  observational  strategies  discussed  in  the  introduction  were  employed  in  the 
design  of  BEMPEX:  first,  isolation,  i.e.,  a  region  of  the  North  Pacific  was  chosen  for 
which  it  could  be  reasonably  assumed  that  other  sources  of  energy  for  gyre-scale  motions 
(such  as  instabilities  of  strong  “mean”  currents)  were  weak;  second,  measurements  of 
integrating variables, HEFs and $P_b$, were planned, because the theories predicted that the
oceanic  response  to  atmospheric  forcing  would  be  essentially  barotropic  at  the  sub-inertial 
periods  (i.e.,  a  few  days  to  a  few  months)  that  we  could  observe  reasonably  well  with  a 
one  year  record;  third,  a  spatial  array  was  planned  for  confirmation  of  theoretical 
predictions  of  frequency-wavenumber  relations,  and,  fourth,  visual  and  graphical 
comparisons  with  published  model  outputs  of  statistical  parameters  were  planned.  All 
these  process  discrimination  strategies  were  employed  because  previous  experiments  had 
found  that  detection  of  atmospherically  forced  gyre-scale  motions  was  difficult  with 
traditional  point  measurements  of  currents  and  because  the  point  measurements  showed 
significant  spatial  inhomogeneities  in  what  evidence  of  this  phenomenon  they  did  find. 
Figure  4  is  presented  as  an  example  of  how  the  integrating  variable  HEF  readily  provided 
evidence  of  atmospheric  forcing,  while  at  the  same  time  measurements  of  currents  in  the 
surface  mixed  layer  did  not,  probably  because  of  the  superposition  of  many  phenomena  in 
the  mixed  layer  that  have  different,  destructively  interfering  relationships  with  the  surface 
atmospheric  variables. 
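The "95% level of no significance" quoted for the coherence maps can be made concrete with a short sketch. The expression used below is the standard textbook result for the squared coherence that is exceeded by chance only 5% of the time when the two series are truly incoherent; it is an assumption here, because the text does not state which estimator or how many band-averaged estimates were used. The function name and the example counts are hypothetical.

```python
def coherence_null_level(n_estimates: int, confidence: float = 0.95) -> float:
    """Squared coherence exceeded by chance with probability (1 - confidence)
    when n_estimates independent spectral estimates are averaged and the true
    coherence is zero. Standard result, assumed rather than taken from the text."""
    if n_estimates < 2:
        raise ValueError("need at least two independent estimates")
    alpha = 1.0 - confidence
    return 1.0 - alpha ** (1.0 / (n_estimates - 1))


# Illustrative (hypothetical) numbers of independent estimates per band.
levels = {n: coherence_null_level(n) for n in (2, 5, 10, 20)}
for n, level in levels.items():
    print(f"{n:3d} averaged estimates -> squared coherence must exceed {level:.3f}")
```

More averaging lowers the threshold, which is why wide period bands in a record of about one year can still yield contours of significant coherence.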

[Figure 4: two contour panels, (a) and (b); horizontal axes are labeled Zonal Grid Point.]

Figure 4. (a) Contour plot of squared coherence amplitude between BEMPEX zonal barotropic current
(measured at the solid square) and surface zonal wind stress (at each grid point), in the 10-15 day period
band. The wind stress was calculated (Chave et al., 1992b) from the Fleet Numerical Oceanography
Center’s surface wind product. Only squared coherence amplitudes greater than the 95% level of no
significance are plotted. The large region of significant coherence indicates a strong relationship between
oceanic barotropic (depth-independent) zonal current and atmospheric forcing. (b) As for (a), except with
zonal current measured at nominally 73 m depth on a sub-surface mooring located near the electrometer
in (a). The lack of significant coherence is interpreted as a null result, providing no information on the
relatedness of oceanic near-surface zonal current and surface zonal winds.

Figure 5. Contour plot of squared coherence amplitude between BEMPEX meridional barotropic current
(measured at the solid square) and surface wind stress curl (at each grid point), in the 25-70 day band.
Plotted as in Figure 4. The significant coherence surrounding the solid square suggests the oceanic
meridional barotropic current is nearly in Sverdrup balance with the wind stress curl (see text). This is
the only location (out of 7), and the only period band at this location, that exhibited a Sverdrup-like
behavior.

The Frankignoul and Müller papers listed above were the first papers to present the
physics of atmospherically forced meso- and gyre-scale motions (which have the form of
linear Rossby waves) in the realistic light of stochastic forcing; and, most important to
empiricists, they presented testable hypotheses in the form of intervariable transfer and
coherence functions. One example of the latter in flat-bottomed basins is the prediction of
strong coherence between meridional currents and local wind stress curl at periods greater
than O(100 days). The coherence arises from the dominance of a “Sverdrup” balance in
the vorticity conservation equation, in which the curl of the wind stress, which is a source
of vorticity, is balanced by a meridional advection of planetary vorticity. The coherence
does not occur at shorter periods due to destructive interference from many shorter scale
waves with non-trivial relative vorticity. For basins with gently sloping bottoms, a
“topographic Sverdrup” balance obtains between wind stress curl and oceanic currents
that are perpendicular to isopleths of potential vorticity, f/H, where f is the Coriolis
parameter.
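As a rough numerical illustration of the flat-bottom Sverdrup balance just described, the sketch below evaluates the depth-averaged meridional velocity v that satisfies beta * v * H = curl(tau) / rho0. The form of the balance is the standard textbook one and is assumed here; the wind stress curl, depth and density values are purely illustrative and are not BEMPEX data, and the function names are hypothetical.

```python
import math

OMEGA = 7.2921e-5        # Earth's rotation rate (rad/s)
EARTH_RADIUS = 6.371e6   # mean Earth radius (m)


def beta(lat_deg: float) -> float:
    """Meridional gradient of the Coriolis parameter, df/dy (1/(m s))."""
    return 2.0 * OMEGA * math.cos(math.radians(lat_deg)) / EARTH_RADIUS


def sverdrup_velocity(curl_tau: float, lat_deg: float, depth: float,
                      rho0: float = 1025.0) -> float:
    """Depth-averaged meridional velocity (m/s) from beta*v*H = curl(tau)/rho0."""
    return curl_tau / (rho0 * beta(lat_deg) * depth)


# Illustrative values only: curl of 1e-7 N/m^3 at 40N over 5500 m of water.
v = sverdrup_velocity(curl_tau=1.0e-7, lat_deg=40.0, depth=5500.0)
print(f"beta = {beta(40.0):.3e} 1/(m s), v = {100.0 * v:.3f} cm/s")
```

The resulting velocity is of order 1 mm/s, which gives a sense of why the balance is easily masked by short-scale waves carrying substantial relative vorticity.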

Evidence  for  the  flat-bottom  Sverdrup  relation  was  found  in  BEMPEX  (Fig.  5),  and 
evidence  for  the  topographic  Sverdrup  relation  was  reported  by  Niiler  and  Koblinsky 
(1985).  But,  the  coherence  shown  in  Figure  5  did  not  occur  at  any  other  period  for  that 
instrument,  nor  was  there  Sverdrup-like  coherence  at  any  period  for  the  other  six  HEMs. 
Furthermore,  a  systematic  search  of  North  Pacific  current  meter  records  by  Koblinsky  et 
al.  (1989)  produced  no  further  examples  of  a  topographic  Sverdrup  balance  of  oceanic 
currents.  The  problem  lies  with  the  generation  of  short-scale  Rossby  waves  by  short-scale 
topography  as  the  wind  stress  curl  drives  the  water  across  isopleths  of f/H  (Anderson  and 
Corry,  1985;  Cummins,  1991).  The  short-scale  waves  have  substantial  relative  vorticity, 
so  that  a  Sverdrup  balance  usually  does  not  dominate  vorticity  conservation  until  very 
long  periods.  Cummins  (1991)  demonstrated,  with  a  numerical  model  of  the  North  Pacific 
having  realistic  topography,  that  by  spatially  filtering  out  the  shorter  scale  waves  the 
Sverdrup  balance  of  the  longer  waves  can  be  recovered.  Following  Cummins,  we  have 
averaged  the  meridional  currents  from  the  five  HEMs  that  comprised  a  coherent  sub-array 
in  BEMPEX  (Fig.  3).  The  resultant  averaged  meridional  currents  were  coherent  with  wind 
stress  curl  at  all  periods  >10  days;  Figure  6  shows  the  coherences  from  two  period  bands. 
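The benefit of averaging over a sub-array can be sketched with synthetic data. In the toy model below, each of five instruments records a common large-scale signal plus independent "short-scale" noise, and ordinary correlation with the forcing stands in for coherence in a single band. Everything here (noise levels, record length, names) is a hypothetical illustration of the spatial-filtering idea, not a reconstruction of the BEMPEX analysis.

```python
import math
import random


def pearson(x, y):
    """Ordinary correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


random.seed(0)
n_samples, n_instruments = 2000, 5

# Common, forcing-related large-scale signal (unit variance).
forcing = [random.gauss(0.0, 1.0) for _ in range(n_samples)]

# Each instrument adds its own independent short-scale contribution.
records = [[f + random.gauss(0.0, 3.0) for f in forcing]
           for _ in range(n_instruments)]

# Spatial average across the sub-array at each time step.
array_mean = [sum(vals) / n_instruments for vals in zip(*records)]

r_single = pearson(forcing, records[0])
r_mean = pearson(forcing, array_mean)
print(f"single instrument: r = {r_single:.2f}; array average: r = {r_mean:.2f}")
```

Averaging reduces the variance of the incoherent part by roughly the number of instruments while leaving the common part untouched, so the relationship with the forcing emerges more clearly.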

BEMPEX yielded many significant statistics (Chave et al., 1992b) with which to determine
the kinematics of the oceanic Rossby waves and with which to test Müller and
Frankignoul’s (1981) predictions of frequency-dependent local coherence between various
oceanic and atmospheric variables. Non-zero coherences between oceanic variables and
non-local atmospheric variables, predicted by Brink (1989), were also unambiguously
observed (Luther et al., 1990; Chave et al., 1992b). No point measurements of currents
have yielded such clear evidence of direct atmospheric forcing of Rossby waves as has
been obtained with measurements of the integrating variables, HEF and $P_b$ (the latter to be
discussed further below).

Figure 6. Contour plot of squared coherence amplitude between averaged BEMPEX meridional barotropic
currents (measured at the 5 southernmost electrometers in Figure 3) and surface wind stress curl, in the
(left) 25-70 day band, and (right) 13-19 day band. A Sverdrup-like relationship (see text) is evident in
both period bands, and at all other periods greater than 10 days, for the averaged meridional barotropic
current. The solid square in both plots locates the nominal center of mass of the five electrometers.
Otherwise plotted as in Figure 4.

The  example  above,  describing  efforts  to  confirm  the  relatively  simple  physics  inherent  in 
the  Sverdrup  balance,  emphasizes  the  non-homogeneity  of  even  the  larger  scale  barotropic 
motions  in  the  ocean.  Statistics  estimated  from  observations  of  these  phenomena  are 
correspondingly  inhomogeneous.  Any  observational  program  or  statistical  analysis 
technique,  such  as  some  of  those  highlighted  at  this  workshop,  must  address  these 
inhomogeneities  or  risk  misdirected  inferences. 


BOTTOM PRESSURE ($P_b$)

The  complete  relationship  between  pressure  and  water  velocity  in  the  oceans  is  not  easily 
represented  by  a  simple  integral.  To  lowest  order,  however,  mid-latitude,  sub-inertial 
motions  are  geostrophic,  i.e., 

$$\mathbf{v}_h = \frac{1}{f\rho}\,\mathbf{k}\times\nabla_h p, \tag{9a}$$

which permits the derivation of a simple relationship between pressure and the mass flux
per unit vertical distance (Pedlosky, 1987), viz.,

$$p(\xi) = -f\int_{\xi_0}^{\xi}\mathbf{k}\cdot\left(\rho\,\mathbf{v}_h\times d\mathbf{r}\right) + p(\xi_0), \tag{9b}$$

where $\xi$ and $\xi_0$ are two points in a horizontal plane; $d\mathbf{r}$ is an incremental vector parallel to
an arbitrary curve running from $\xi_0$ to $\xi$ so long as $p(\xi) > p(\xi_0)$; $\rho = \rho_0 + \rho'(z)$; and $\mathbf{k}$ is
the local upward unit vector.

Like  HEFs,  pressure  is  related  to  a  spatial  integral  of  horizontal  velocity.  Unlike  HEFs,  the 
spatial  distance  over  which  the  integral  operates  is  somewhat  arbitrary  for  pressure.  But, 
the  greater  the  separation  between  members  of  a  set  of  pressure  measurements,  the 
weaker  the  correlation  between  them  due  to  the  substantial  wavenumber  bandwidth  of 
oceanic  sub-inertial  motions.  Lack  of  coherence  is  usually  fatal  for  process  studies  but  is 
often  considered  irrelevant  for  basin-wide  studies  of  transport,  for  instance.  The 
integrating  nature  of  pressure  is  in  large  part  responsible  for  the  successful  mapping  of  the 
semi-permanent  oceanic  flows  with  hydrographic  (temperature  and  salinity  versus  depth) 
data,  from  which  pressure  is  calculated,  because  smaller  scale  variability  tends  to  have  a 
weaker  impact  on  pressure  (which  can  be  argued  from  either  Eq.  (9a)  or  Eq.  (9b)). 
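The integrating character of pressure can be illustrated with a short calculation. For geostrophic flow with the Coriolis parameter and density treated as constant along the section (a simplifying assumption made only for this sketch), the magnitude of the volume flux per unit depth crossing any curve between two points depends only on the pressure difference between them. The numbers and function names below are hypothetical.

```python
import math

OMEGA = 7.2921e-5  # Earth's rotation rate (rad/s)


def coriolis(lat_deg: float) -> float:
    """Coriolis parameter f = 2 * Omega * sin(latitude), in 1/s."""
    return 2.0 * OMEGA * math.sin(math.radians(lat_deg))


def transport_per_unit_depth(delta_p: float, lat_deg: float,
                             rho: float = 1025.0) -> float:
    """Magnitude of the geostrophic volume flux per unit depth (m^2/s) between
    two points whose pressures differ by delta_p (Pa), assuming constant f and
    rho along the section."""
    return abs(delta_p) / (abs(coriolis(lat_deg)) * rho)


# Illustrative: a 100 Pa (about 1 cm of water) pressure difference at 40N.
q = transport_per_unit_depth(delta_p=100.0, lat_deg=40.0)
print(f"{q:.0f} m^2/s per metre of depth, whatever the separation of the gauges")
```

Because the separation of the two points does not enter, the horizontal distance over which the integral operates is indeed somewhat arbitrary, as noted above.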

In  addition  to  discriminating  against  smaller  scales  of  motion,  bottom  pressure 
discriminates  against  baroclinic  motions  in  favor  of  barotropic.  This  follows,  for  example, 
from the vertical structure functions that are determined from Eq. (6). The
barotropic  mode  is  independent  of  depth,  while  the  baroclinic  modes  have  their  largest 
amplitudes  near  the  sea  surface.  If  the  barotropic  and  baroclinic  modes  have  identical  total 
kinetic  energy,  integrated  from  the  seafloor  to  the  sea  surface,  then  the  barotropic  mode 
will  have  a  larger  amplitude  at  the  seafloor  than  any  of  the  baroclinic  modes.  Since  the 
barotropic  and  first  baroclinic  modes  typically  have  similar  kinetic  energies  (and  the  higher 
modes are weaker), $P_b$ (but not pressure in the upper ocean) tends to be dominated by
large-scale barotropic motions, even in regions of the oceans with energetic baroclinic
mesoscale motions such as the western North Atlantic. This latter point accounts for the
large horizontal correlation of sub-inertial $P_b$ found over distances of hundreds of
kilometers in the western North Atlantic by Brown et al. (1975) during the Mid-Ocean
Dynamics Experiment, while horizontal correlations of currents and temperatures in the
same general area tend to zero when separations of O(100 km) are attained (Owens, 1985).

Considering the tendency of $P_b$ to be more or less dominated by the large-scale
sub-inertial motions, we might expect that $P_b$ in BEMPEX will be less affected by the
short-scale waves that made detection of the Sverdrup balance, for instance, so difficult with
point measurements of currents or even HEF-derived barotropic currents. In fact, we do
find from BEMPEX that $P_b$ is much more coherent with surface atmospheric variables
(Fig.  7)  than  are  the  barotropic  currents.  And,  the  coherence  between  the  pressure  records 
is  greater  than  found  for  the  barotropic  currents,  despite  the  larger  separation  of  the 
pressure  gauges  (Fig.  3).  The  extended  regions  of  high  squared  coherence  in  Figure  7  are 
not  so  much  evidence  of  waves  reaching  the  instrument  from  all  over  the  Pacific  as  they 
are  evidence  of  high  horizontal  coherence  in  the  surface  atmospheric  fields  themselves. 

The non-local maximum of the squared coherence in Figure 7 is expected from the
dominance of propagating waves over locally forced motions at these periods (Brink, 1989).
As the period decreases, the maximum coherence between $P_b$ and air pressure or wind
stress curl becomes more local (Luther et al., 1990), in accordance with the disappearance
of freely propagating Rossby waves (Müller and Frankignoul, 1981).

Figure 7. Contour plot of squared coherence amplitude between BEMPEX bottom pressure (measured at the
solid square) and (top) surface air pressure, or (bottom) surface wind stress curl, both in the 19-38 day
band. Plotted as in Figure 4, with Zonal Grid Point on the horizontal axes. The extended regions of strong
coherence are typical for the bottom pressure records at most periods, unlike the coherences found for
barotropic currents which tended to be weaker, less extensive, more spatially inhomogeneous, and clearly
significant in fewer period bands.

Bottom pressure is so dominated by the larger scale barotropic motions that all the $P_b$
records from BEMPEX display very similar coherence relationships with the atmospheric
variables, unlike the situation for the barotropic currents which exhibit more
inhomogeneities in their relationships with atmospheric variables. For $P_b$, atmospheric
forcing is clear at all sub-inertial frequencies, as seen by the graphs of maximum coherence
in Figure 8. The fact that the coherence of $P_b$ with air pressure is frequently higher than its
coherence  with  wind  stress  curl  (Fig.  8)  does  not  implicate  a  particular  forcing  mechanism, 
because  the  atmospheric  variables  are  coherent  among  themselves,  and  there  is  more  noise 
in  wind  stress  curl  than  in  air  pressure.  A  simple  scaling  argument  shows  (Philander,  1978) 
that  divergence  of  the  surface  (Ekman)  boundary  layer,  produced  by  the  curl  of  the  wind 
stress,  should  dominate  all  other  forcing  mechanisms  at  the  time  and  space  scales  observed 
in  BEMPEX  (Chave  et  al.,  1992a). 

COMBINING HEM AND IES MEASUREMENTS

The intent of this section is to demonstrate the great potential of combining measurements
of two integrating variables listed in Table 1. The combination of measurements of
horizontal electric fields (HEFs) and vertical acoustic travel times (VATTs) can provide
estimates of (1) volume transport per unit width, (2) the gravest vertical structure (i.e.,
barotropic and first baroclinic modes) of the horizontal currents, and (3) the total heat flux
(using the gravest vertical structures of the currents and temperature). Because seafloor
HEMs and IESs are inexpensive to make and deploy compared to current meter moorings,
it is not unreasonable to envision the deployment of large arrays of HEMs and IESs for
both dynamical process studies and the accumulation of transport time series for climate
studies. That most of the ocean’s low frequency structure and variability can thereby be
observed from the seafloor using integrating variables is quite remarkable.

The VATT measured by an IES is

$$\tau = 2\int_{-H}^{0}\frac{dz}{c}, \tag{10}$$

where H is the bottom profile, and c = c(z,t) is the speed of sound. Potential small errors
in the interpretation of VATT in terms of the simple relation in Eq. (10) have been
enumerated by Watts and Rossby (1977).
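Eq. (10) is straightforward to evaluate numerically for a sampled sound speed profile. The sketch below uses the trapezoidal rule on 1/c; the function name and the uniform test profile are hypothetical and serve only to show the bookkeeping.

```python
def round_trip_travel_time(depths, sound_speeds):
    """Round-trip vertical acoustic travel time, tau = 2 * integral(dz / c).

    depths: positive-downward sample depths (m), increasing from the sea
            surface (0) to the seafloor (H).
    sound_speeds: sound speed (m/s) at each sample depth.
    The trapezoidal rule is applied to the slowness 1/c.
    """
    one_way = 0.0
    for i in range(1, len(depths)):
        dz = depths[i] - depths[i - 1]
        one_way += 0.5 * dz * (1.0 / sound_speeds[i - 1] + 1.0 / sound_speeds[i])
    return 2.0 * one_way


# Sanity check with a uniform 1500 m/s ocean that is 4500 m deep.
depths = [500.0 * k for k in range(10)]   # 0, 500, ..., 4500 m
uniform = [1500.0] * len(depths)
tau = round_trip_travel_time(depths, uniform)
print(f"tau = {tau:.3f} s")
```

In a real application the profile of c would come from an empirical sound speed formula applied to measured temperature, salinity and pressure.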
Figure 8. Maximum squared coherence amplitude, over the oceanic domain of Figure 3, between each
BEMPEX bottom pressure record and (top) surface air pressure, or (bottom) surface wind stress curl,
plotted as a function of frequency. The 95% level of no significance is indicated in each plot. For each
station, every other point plotted is independent due to a 50% overlap of frequency band-averaging. The
ubiquitously high coherence maxima indicate that bottom pressure, and hence the large-scale barotropic
motions that it represents, is strongly related to atmospheric forcing in the central North Pacific.
First Baroclinic Displacement Mode Amplitude

Consider temperature, T, salinity, S, and pressure, P, as state variables, so
$c(\mathbf{x},t) = c(T(\mathbf{x},t), S(\mathbf{x},t), P(\mathbf{x},t))$. Following Pickart and Watts (1990), we idealize
variations in T and S as perturbations on a base profile which varies only with z (pressure
is not perturbed since it is essentially the integration variable), therefore

$$T(z,t) = \bar{T}(z + \zeta(z,t)) \tag{11}$$

and similarly for S, where $|\zeta| \ll |z|$ by assumption. We now expand $\zeta$ in terms of
displacement modes per Eq. (6), such that

$$\zeta(z,t) = \sum_i q_i(t)\,\theta_i(z), \tag{12}$$

where the $q_i$ are non-dimensional since the $\theta_i$ have dimensions of length. Substituting Eq.
(12) into the perturbation forms of T and S, and truncating after mode 1, the sound
velocity can be written

$$c(z,t) = c[\bar{T}(z + q_1\theta_1), \bar{S}(z + q_1\theta_1), P(z)]. \tag{13}$$

After the basic state profiles are chosen, numerical evaluation of c based on its empirical
dependence on T, S and P, using different values for $q_1$, leads to a functional relationship
between $\tau$ and $q_1$ (Pickart and Watts, 1990), which can be inverted to yield the amplitude
of the first baroclinic displacement mode for any measured VATT. In practice, since the
depth is never known precisely enough, in situ profiles of T and S must be taken (by CTD
or XBT) while the IES is deployed in order to calibrate the VATT. Pickart and Watts
(1990) have shown evidence that the relationship between $\tau$ and $q_1$ in Eq. (13) is not
sensitive to the choice of basic state profile of S (or buoyancy frequency, N, used in Eq.
(6)), although they do note that the choice of basic state T profile is important, and a
climatological mean T profile is inadequate in frontal regions such as the Gulf Stream.
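The procedure just described (displace a basic state profile by the first mode, evaluate the sound speed, integrate to obtain the travel time, then invert) can be sketched as follows. This is a minimal illustration under loudly stated assumptions: the basic state temperature profile, the mode shape, the water depth and the constant salinity are all hypothetical, and Medwin's (1975) simplified sound speed formula is swapped in for whatever empirical relation was actually used.

```python
import math

H = 4500.0  # hypothetical water depth (m)


def base_temperature(z):
    """Hypothetical basic state temperature (deg C); z <= 0 is height (m)."""
    return 2.0 + 18.0 * math.exp(z / 800.0)


def mode_shape(z):
    """Hypothetical first-mode displacement shape (m), zero at top and bottom."""
    return 100.0 * math.sin(-math.pi * z / H)


def sound_speed(temp, sal, depth):
    """Medwin's (1975) simplified sound speed formula (m/s); depth in m."""
    return (1449.2 + 4.6 * temp - 0.055 * temp ** 2 + 0.00029 * temp ** 3
            + (1.34 - 0.010 * temp) * (sal - 35.0) + 0.016 * depth)


def travel_time(q1, n_layers=450):
    """Round-trip travel time (s) for mode amplitude q1, by the midpoint rule."""
    dz = H / n_layers
    one_way = 0.0
    for k in range(n_layers):
        z = -(k + 0.5) * dz
        temp = base_temperature(z + q1 * mode_shape(z))  # displaced profile
        one_way += dz / sound_speed(temp, 35.0, -z)
    return 2.0 * one_way


def invert_travel_time(tau, lo=-1.0, hi=1.0, tol=1e-10):
    """Recover q1 from tau by bisection; in this toy setup tau decreases
    monotonically as q1 increases, which the bracketing relies on."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if travel_time(mid) > tau:
            lo = mid   # measured time is shorter, so q1 must be larger
        else:
            hi = mid
    return 0.5 * (lo + hi)


q1_recovered = invert_travel_time(travel_time(0.4))
print(f"recovered q1 = {q1_recovered:.6f}")
```

The inversion only works because the travel time varies monotonically with the mode amplitude over the bracketed range; the in situ calibration discussed above would enter as an offset applied to the measured travel time before inversion.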

The strong (weak) dependence of VATT on the first (other) baroclinic mode for
mid-latitude hydrographic profiles has been documented by Watts and Rossby (1977) and
Pickart and Watts (1990). (Also, Hall, 1986, and Pickart and Watts, 1990, have shown
with current meter data that the first baroclinic mode dominates the vertical velocity,
hence also the vertical displacement, in the Gulf Stream.) In the tropics, however, second
baroclinic mode variability makes a non-trivial contribution to the VATT and cannot be
ignored (Garzoli and Katz, 1981). In what follows, we are assuming the application is at
mid- to high-latitudes.
First Baroclinic Current Mode Amplitude


Departing from previous authors, we develop an expression for the amplitude of the first
baroclinic mode of current as follows. Under the hydrostatic and geostrophic
approximations,

$$\frac{\partial\mathbf{v}_h}{\partial z} = -\frac{g}{f\rho_0}\,\mathbf{k}\times\nabla_h\rho. \tag{14}$$

Let there be small perturbations of $\rho$ as per Eq. (11), so that

$$\frac{\partial\mathbf{v}_h}{\partial z} = -\frac{g}{f\rho_0}\,\frac{d\bar{\rho}}{dz}\,\mathbf{k}\times\nabla_h\zeta. \tag{15}$$

Substituting the modal expansions for $\mathbf{v}_h$ and $\zeta$ (see Eqs. (6a) and (12)) in Eq. (15),
applying the second relation in Eq. (6c), and truncating after mode 1 yields an expression
for the amplitudes of the first baroclinic current modes, viz.,

a*i=-^ixV»q„  (16) 

where  yf  is  the  first  baroclinic  mode  eigenvalue  determined  from  Eq.  (6).  Note  that  none 
of  the  physical  assumptions  leading  to  Eq.  (16),  except  the  modal  truncation,  is  more 
severe than is typically used to estimate relative or absolute (β-spiral) currents from
hydrographic  data  or  to  estimate  cross-Gulf  Stream  profiles  of  current  (and  transport, 
after  upward  extrapolations)  from  single  moorings  (e.g.,  Hogg,  1992). 

Analysis of the combined HEF and VATT datasets from the SYNOP experiment is in its
early stages, but we can show a simple preliminary comparison of two derivations of one
horizontal component of $\mathbf{a}_{h,1}$ in Figure 9a. Rather than using observed VATTs to estimate
first mode displacement from Eq. (16), we simply assumed that the difference of the
measured VATTs from two IESs is proportional to the first mode current amplitude, then
estimated the constant of proportionality by least squares. The result is the dotted curve in
Figure 9a. The solid curve in Figure 9a is an average of the first mode current amplitudes
from three moorings, two at the endpoints of, and one close to, the line running between the
two IESs. The agreement between the curves is certainly encouraging.
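The least-squares step mentioned above amounts to fitting a single constant of proportionality through the origin. A minimal sketch follows; it assumes the VATT difference is regressed against a mooring-derived amplitude, which the text does not spell out, and the data values and names are entirely hypothetical.

```python
def fit_proportionality(x, y):
    """Least-squares k minimizing sum((y - k*x)**2): k = sum(x*y) / sum(x*x)."""
    sxx = sum(a * a for a in x)
    if sxx == 0.0:
        raise ValueError("predictor is identically zero")
    return sum(a * b for a, b in zip(x, y)) / sxx


# Hypothetical samples: VATT difference between two IESs (ms) and a
# mooring-derived first mode current amplitude (m/s) at the same times.
vatt_difference = [-2.0, -1.0, 0.5, 1.5, 3.0]
mooring_amplitude = [-0.41, -0.19, 0.11, 0.29, 0.62]

k = fit_proportionality(vatt_difference, mooring_amplitude)
ies_amplitude = [k * d for d in vatt_difference]   # calibrated IES estimate
print(f"k = {k:.4f} (m/s) per ms")
```

Fitting through the origin reflects the stated assumption of strict proportionality; an intercept would instead allow for a constant offset between the two records.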

Volume  Transport  Per  Unit  Width 

Our estimate of volume transport per unit width is simply $\mathbf{a}_{h,0}$ from Eq. (8) times the
depth, H. To solve Eq. (8), we need estimates of $s_1$ and $\mathbf{a}_{h,1}$. The latter are obtained from
the IESs by Eq. (16). The former are obtained by reconstructing a time-dependent
conductivity profile (using IES-derived estimates of $q_1$ in an expression for conductivity
similar to that for sound speed in Eq. (13)), which is then decomposed according to Eq.
(7).
Figure 9. (a) IES (dotted curve) and mooring estimates (see text) of one component of the vector
amplitude of the first baroclinic mode of horizontal current. (b) HEM/IES (dotted curve) and
mooring estimates (see text) of one component of the vector amplitude of the barotropic mode of
horizontal current, $\mathbf{a}_{h,0}$. Data for both plots were taken during the SYNOP experiment in the Gulf Stream
at nominally 68°W. Ordinate units are m/sec.


As in Figure 9a, a quick estimate of that component of $\mathbf{a}_{h,0}$ parallel to $\mathbf{a}_{h,1}$, shown in Figure
9a, is presented as the dotted curve in Figure 9b. The IES-derived estimate of $\mathbf{a}_{h,1}$ in
Figure 9a was used in Eq. (8) with a climatological mean $s_1$. An average of the data from
two HEMs (deployed near the IESs) was used in Eq. (8) as well. The only calibration
employed was that for the first mode amplitude described above. The solid curve in Figure
9b is an average of barotropic mode current amplitudes from the same three moorings
used in Figure 9a.

[Note that the comparison in Figure 9b is not directly relatable to the HEM-mooring
comparison of transport estimates discussed previously, and evidenced by Figures 1 and 2,
because Figure 9b only shows one of the two horizontal components of $\mathbf{a}_{h,0}$, and Figure
9b is necessarily derived from data spanning about 50 km, whereas the data for the prior
comparison were all obtained at a single geographic location.]

Current  Profiles  and  Heat  Flux 

The large vertical scale currents, $\mathbf{v}_h(z,t)$, are reconstructed by adding $\mathbf{a}_{h,0}$ and the first
baroclinic mode of current, whose amplitude is $\mathbf{a}_{h,1}$. Heat flux is readily obtained from this
reconstructed current profile and a reconstructed potential temperature profile, following
Eqs. (11) and (12) truncated after mode 1.


Summary  of  Some  Oceanic  Variables  Derivable  from  HEFs  and  VATTs 

An ideal array would result in at least three IESs situated around each HEM. (Note that
this does not mean that three times as many IESs are deployed as HEMs.) After choosing
appropriate basic state temperature, $\bar{T}(z)$, and salinity, $\bar{S}(z)$, profiles, preferably from
coincident CTD profiles rather than climatological mean profiles, the following are
estimated:


$q_1(t)$, for each IES (see Eq. (13) and subsequent discussion);

1  ** 

qi  (/)  = — 

$\bar{T}_p(z)$, the base state potential temperature profile, from the equations of state;

$T_p(z,t) = \bar{T}_p(z + q_1(t)\,\theta_1(z))$;

$\bar{\sigma}(z)$, the basic state conductivity profile, from the equations of state; and
$\sigma(z,t) = \bar{\sigma}(z + q_1(t)\,\theta_1(z))$.

Then  the  amplitudes  of  the  first  baroclinic  modes  of  horizontal  current  are  estimated  from 
Eq.  (16),  viz., 

where  the  eigenvalue  yf  is  obtained  from  solving  Eq.  (6)  with  a  basic  state  buoyancy 
profile, $N(z)$, derived from the equations of state using $\bar{T}(z)$ and $\bar{S}(z)$. The barotropic
modal amplitudes follow from Eq. (8), viz.,


_  £!,(t) 

C|f,| 

. 

a.n  “ 


V.0 


C\R 


~s,aj,j. 


where $s_1$ is obtained from Eq. (7), using $\sigma(z,t)$ from above.
Finally, we arrive at estimates of the following oceanic quantities:
• Volume transport per unit width $= H\,\mathbf{a}_{h,0}$;
• Horizontal current profiles, $\mathbf{v}_h(z,t)$, the sum of $\mathbf{a}_{h,0}$ and the first baroclinic mode of current with amplitude $\mathbf{a}_{h,1}$; and
• Un-normalized heat transport per unit width $= \int_{-H}^{0}\rho_0\,C_p\,\mathbf{v}_h(z,t)\,T_p(z,t)\,dz$,

where $C_p$ is the specific heat of seawater at constant pressure.


CONCLUSIONS 

The  ability  to  observe  variables  (such  as  those  listed  in  Table  1)  that  are  natural  spatial 
integrals  of  water  motion  or  state  properties  in  the  oceans  provides  a  useful,  yet 
underutilized,  strategy  for  process  discrimination  in  field  experiments.  For  those  situations 
when  observation  of  an  integral  quantity,  like  volume  transport,  is  the  desired  end  result, 
integrating  variables  are  likely  to  yield  more  accurate  results  than  point  measurements  of 
currents  or  state  properties,  as  the  one  example  presented  above  indicates.  Integrating 
variables  should  also  be  more  useful  than  point  measurements  for  validation  of  numerical 
models  of  large-scale  processes,  because  these  variables  in  the  ocean  are  not 
“contaminated”  by  short-scale  processes  that  are  not  simulated  in  the  models. 

Specific estimation of statistics from integrating variables, examples of which were shown
previously, demonstrates that even large-scale oceanic processes with the simplest physics
exhibit  significant  spatial  inhomogeneities.  Any  modelling  effort,  observational  program, 
or  statistical  analysis  technique,  such  as  some  of  those  highlighted  at  this  workshop,  must 
address  these  inhomogeneities  or  risk  misdirected  inferences. 
1. INTRODUCTION

The useful information in a signal is usually carried by both its frequency content and its
time evolution. If we consider only the time representation, we do not know the spectrum,
whereas the Fourier spectral representation does not give information on the time of
occurrence of each frequency. A more appropriate representation should combine these
two complementary descriptions. This is true in particular for turbulent signals, especially
those presenting bursts or some intermittent, quasi-singular behaviours. The uncertainty
principle precludes analysis of the signal from both sides of the Fourier transform at the
same time because of the condition $\Delta t \cdot \Delta\nu > 1$ (normalized information cell). Therefore it is
always a compromise: either good time resolution $\Delta t$ but loss of spectral resolution $\Delta\nu$,
which is the case when we sample a signal by convolving it with a Dirac comb (Fig. 1a), or
good spectral resolution $\Delta\nu$ but loss of time resolution $\Delta t$, which is the case with the
Fourier transform (Fig. 1b). These two transforms are the most commonly used in practice
because they allow construction of orthogonal bases onto which the signal can be
projected for analysis and eventual computation.

In order to improve time resolution while using the Fourier transform, Gabor (1946)
proposed the windowed Fourier transform, which consists of convolving the signal with a
set of Fourier modes localized in a Gaussian envelope of constant width (Fig. 1c). This
transform then allows a time-frequency decomposition of the signal at a given scale $a_0$,
which is kept fixed. But unfortunately, as shown by Balian (1981), the bases constructed
with such windowed Fourier modes cannot be orthogonal. More recently, Grossmann and
Morlet (1984, 1985) have devised a new transform, the so-called wavelet transform,
which consists of convolving the signal with a set of affine functions all presenting the
same frequency $\nu_0$; the family of analysing wavelets is obtained by dilation and
translation of a given function $\psi$ presenting at least one oscillation. The wavelet transform
therefore allows a time-scale decomposition of the signal at a given frequency $\nu_0$, which is
kept fixed. Actually the wavelet transform realizes the best compromise of the uncertainty
principle, because it adapts the time-frequency resolution $\Delta t\,\Delta\nu$ to each scale $a$. In fact it
gives a good spectral resolution $\Delta\nu$ with a limited time resolution $\Delta t$ in the large scales,
but also gives a good time localization $\Delta t$ with a limited spectral resolution $\Delta\nu$ in the small
scales (Fig. 1d). The continuous wavelet transform has been extended to n dimensions by
Murenzi (1989).
In 1985 Meyer, while trying to prove the impossibility of constructing orthogonal bases,
as Balian had earlier done for the case of the windowed Fourier transform, was surprised
to discover an orthogonal wavelet basis built with spline functions, now called the
Meyer-Lemarié wavelets (Lemarié and Meyer, 1986). In fact the Haar orthogonal basis, which
was proposed in 1909, is now recognized as the first orthogonal wavelet basis
known, but the functions it uses are not regular, which drastically limits its application. In
practice one likes to build orthogonal wavelet bases using functions having a prescribed
regularity, to provide enough spectral decay for the application at hand. In particular,
following Meyer's work, Daubechies (1988) proposed new orthogonal wavelet bases
built with compactly supported functions of prescribed regularity, defined by discrete
quadrature mirror filters (QMF) of different lengths: the longer the filter, the more regular
the associated functions. Mallat (1989) devised a fast algorithm to compute the
orthogonal wavelet transform using wavelets defined by QMF; it has been used in
particular to devise more efficient techniques for numerical analysis (Beylkin, Coifman,
and Rokhlin, 1992). More recently, Malvar (1990) and Coifman and Meyer (1991)
found a new kind of window of variable width, which allows the construction of
orthogonal adaptive local cosine bases. The elementary functions of such bases are then
parametrized by their position $b$, their scale $a$ (width of the window), and their
wavenumber $k$ (proportional to the number of oscillations inside each window). In the
same spirit, Coifman et al. (1990), Wickerhauser (1990), and Coifman, Meyer, and
Wickerhauser (1992) proposed the so-called wavelet packets which, similarly to
compactly supported wavelets, are wavepackets of prescribed regularity defined by
discrete QMF, from which one can construct orthogonal bases. A review of the different
types of wavelet transforms and their applications to the analysis and computation of turbulent
flows in 2D and 3D is given in Farge (1992a,b).

2.  THE  CONTINUOUS  WAVELET  TRANSFORM 

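The only condition a function $\psi(x) \in L^2(\mathbb{R})$, real or complex-valued, should satisfy to be
called a wavelet is the admissibility condition,

$$C_\psi = 2\pi \int_{-\infty}^{\infty} |\hat\psi(k)|^2\,\frac{dk}{|k|} < \infty \qquad (1)$$

with

$$\hat\psi(k) = \int_{-\infty}^{\infty} \psi(x)\,e^{-ikx}\,dx. \qquad (2)$$

If $\psi$ is integrable, this condition implies that the wavelet has a zero mean:

$$\int \psi(x)\,dx = 0 \quad \text{or} \quad \hat\psi(k=0) = 0. \qquad (3)$$
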
In practice one also wishes the wavelet to be as localized as possible on both sides of the
Fourier transform, namely that

$$|\psi(x)| < \frac{1}{1+|x|^{n}} \qquad (4)$$

$$|\hat\psi(k)| < \frac{1}{1+|k-k_0|^{n}} \qquad (5)$$

with $k_0$ being the frequency of the wavelet and $n$ as large as possible.


Figure 2 shows examples of the most commonly used wavelets: the Marr wavelet (Fig. 2a),
also called the Mexican hat, a real-valued function used for the isotropic continuous
wavelet transform; the Morlet wavelet (Fig. 2b), a complex-valued function used for the
non-isotropic continuous wavelet transform; and the Meyer-Lemarié wavelet (Fig. 2c) and the
Daubechies wavelet (Figs. 2d, 2e), real-valued functions used to build orthogonal bases.

For several applications, in particular to study fractals, one also wishes the wavelet to have
a good regularity, namely that $\hat\psi(k)$ decays rapidly near zero or, equivalently, that the
wavelet has enough cancellations, such that

$$\int \psi(x)\,x^{n}\,dx = 0 \qquad (6)$$

with $n$ as large as possible.


Then, after having chosen the so-called ‘mother wavelet’ $\psi$, one generates the family of
wavelets $\psi_{b,a}(x)$ by continuously translating the ‘mother wavelet’ $\psi$ along the signal (position $b$)
and continuously dilating it to all accessible scales $a$, which gives

$$\psi_{b,a}(x) = \frac{1}{N(a)}\,\psi\!\left(\frac{x-b}{a}\right) \qquad (7)$$

with $N(a)$ a normalization coefficient equal either to $a^{1/2}$, if one wishes the squared
modulus of the wavelet coefficients to correspond to an energy density ($L^2$ norm), or to $a$,
if one uses the wavelet coefficients to analyze the local regularity of the signal ($L^1$ norm).

The continuous wavelet analysis of the function $f(x) \in L^2(\mathbb{R})$ is then the inner product
between $f(x)$ and the set of all translated and dilated wavelets,

$$\tilde f(b,a) = \int_{-\infty}^{\infty} f(x)\,\psi^{*}_{b,a}(x)\,dx, \qquad (8)$$

where $*$ indicates the complex conjugate.
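
As an illustration of (7) and (8), the following Python sketch evaluates the wavelet coefficients by direct quadrature for the real-valued Marr wavelet. The function names `mexican_hat` and `cwt`, the rectangle-rule quadrature, and the grid spacing `dx` are assumptions made for this example; they are not taken from the computations reported in this paper.

```python
import numpy as np

def mexican_hat(x):
    """Marr (Mexican hat) wavelet, real-valued and zero-mean."""
    return (1.0 - x**2) * np.exp(-0.5 * x**2)

def cwt(signal, scales, dx=1.0, norm="L2"):
    """Coefficients f~(b, a) = (1/N(a)) * integral f(x) psi((x - b)/a) dx.

    The wavelet is real, so the complex conjugate in (8) is the identity.
    Positions b are taken on the same grid as the signal samples.
    """
    n = signal.size
    x = np.arange(n) * dx
    coeffs = np.empty((len(scales), n))
    for i, a in enumerate(scales):
        # N(a) = a**0.5 for the L2 norm, N(a) = a for the L1 norm, as in (7).
        n_a = np.sqrt(a) if norm == "L2" else a
        # Row j of `kernel` holds psi((x - b_j)/a) / N(a) sampled on the grid x.
        kernel = mexican_hat((x[None, :] - x[:, None]) / a) / n_a
        coeffs[i] = kernel @ signal * dx  # rectangle-rule quadrature
    return coeffs

# Dirac-like spike: at every scale the modulus is largest at the spike position.
spike = np.zeros(128)
spike[64] = 1.0
coeffs = cwt(spike, scales=[2.0, 4.0, 8.0])
```

The direct quadrature costs of the order of the square of the number of samples per scale; it is chosen here only because it follows (8) literally.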
The wavelet transform therefore projects the $L^2(\mathbb{R})$ space of finite energy functions into
the $L^2(\mathbb{R}\times\mathbb{R}^{+})$ space of wavelet coefficients having a measure $da\,db/a^2$, which is the Haar
measure associated to the affine group. Figure 3 shows six examples of wavelet analysis
of academic signals: a Dirac spike (Fig. 3a), the superposition of two cosine functions
having different frequencies (Fig. 3b), the superposition of two cosine functions of very
different amplitudes (Fig. 3c), a chirp (Fig. 3d), a Gaussian white noise (Fig. 3e), and
finally a chirp in the presence of a strong noise (Fig. 3f).

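From the wavelet coefficients $\tilde f(b,a)$, one is able to reconstruct the function $f(x)$ using the
inverse wavelet transform, defined as

$$f(x) = C_\psi^{-1} \int_{0}^{\infty}\!\int_{-\infty}^{\infty} \tilde f(b,a)\,\psi_{b,a}(x)\,\frac{da\,db}{a^2} \qquad (9)$$

with

$$C_\psi = 2\pi \int_{-\infty}^{\infty} |\hat\psi(k)|^2\,\frac{dk}{|k|},$$

a finite valued coefficient given by the admissibility condition (1).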

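One verifies that the wavelet transform conserves energy (as the Plancherel identity does for the
Fourier transform), namely that

$$\int_{-\infty}^{\infty} |f(x)|^2\,dx = C_\psi^{-1} \int_{0}^{\infty}\!\int_{-\infty}^{\infty} |\tilde f(b,a)|^2\,\frac{da\,db}{a^2}. \qquad (10)$$

If the function $f(x)$ belongs to the functional space $L^2(\mathbb{R})$, and if the wavelet is regular
enough and therefore well localized in Fourier space (5), the wavelet analysis may be
interpreted as a band-pass filter with $dk/k$ being constant (Fig. 1d):

$$\tilde f(b,a) = \frac{a}{2\pi N(a)} \int_{-\infty}^{\infty} \hat f(k)\,\hat\psi^{*}(ak)\,e^{ikb}\,dk. \qquad (11)$$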

The extension of the continuous wavelet transform to analyze signals in $n$ dimensions has
been done by Murenzi (1989), considering in this case the Euclidean group with dilations.
The wavelet family $\psi_{a,r,\mathbf{b}}(\mathbf{x})$ is generated by translation (vector $\mathbf{b}$),
dilation (parameter $a$) and rotation (corresponding to the operator $r$ defined in $\mathbb{R}^n$), such
that

$$\psi_{a,r,\mathbf{b}}(\mathbf{x}) = N(a)^{-1}\,\psi\!\left(a^{-1}\,r^{-1}(\mathbf{x}-\mathbf{b})\right). \qquad (12)$$

Figure 3. Wavelet transforms of several academic signals using a Morlet wavelet. For each signal the
panels show the signal, the wavelet coefficient modulus, and the wavelet coefficient phase: (a) Dirac spike;
(b) the superposition of two cosine functions having different frequencies, cos(t) + cos(1.68t);
(c) the superposition of two cosine functions of very different amplitudes, cos(t) + 0.02 cos(2t).

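For $\mathbb{R}^2$, $r$ is the rotation matrix

$$r = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \qquad (13)$$

with $\theta$ the rotation angle.

In $n$ dimensions the admissibility condition becomes

$$C_\psi = (2\pi)^n \int |\hat\psi(\mathbf{k})|^2\,\frac{d^n\mathbf{k}}{|\mathbf{k}|^n}. \qquad (14)$$

The analysis and synthesis are then

$$\tilde f(a,r,\mathbf{b}) = \int f(\mathbf{x})\,\psi^{*}_{a,r,\mathbf{b}}(\mathbf{x})\,d^n\mathbf{x} \qquad (15)$$

$$f(\mathbf{x}) = C_\psi^{-1} \int\!\!\int\!\!\int \tilde f(a,r,\mathbf{b})\,\psi_{a,r,\mathbf{b}}(\mathbf{x})\,\frac{da\,dr\,d^n\mathbf{b}}{a^{n+1}}. \qquad (16)$$

The energy conservation still holds:

$$\int |f(\mathbf{x})|^2\,d^n\mathbf{x} = C_\psi^{-1} \int\!\!\int\!\!\int |\tilde f(a,r,\mathbf{b})|^2\,\frac{da\,dr\,d^n\mathbf{b}}{a^{n+1}}. \qquad (17)$$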


Holschneider (1988) has shown that one can reconstruct the function $f(x)$ from its wavelet
coefficients $\tilde f(b,a)$ by using any other function of $x$ that verifies a modified
admissibility condition such as


(18) 

This, for instance, allows us to reconstruct $f(x)$ by a simple summation of all wavelet
coefficients along the verticals $b$ = constant. This in fact corresponds to using a Dirac
function as the reconstructing function, which gives


with 


C(V)  =  >^£$r(it)£<oo. 


(19) 
3. PROPERTIES OF THE CONTINUOUS WAVELET TRANSFORM

3.1 Covariance by Translation and Dilation

One property of the continuous wavelet transform, which is lost in the case of the
orthogonal wavelet transform, is its covariance by both translation, i.e., under a shift by $x_0$,

$$W[f(x-x_0)] = \tilde f(b-x_0, a) \qquad (20)$$

with $W$ the continuous wavelet transform operator, and dilation, i.e., under scale changes
by a factor $\lambda$,

$$W[f(x/\lambda)] = \lambda\,\frac{N(a/\lambda)}{N(a)}\,\tilde f(b/\lambda, a/\lambda). \qquad (21)$$

3.2  Linearity 

The continuous wavelet transform is a linear transform; therefore we have the following
superposition principle:

$$W[\alpha f_1(x) + \beta f_2(x)] = \alpha\,W[f_1(x)] + \beta\,W[f_2(x)] \qquad (22)$$

with $\alpha$ and $\beta$ two arbitrary constants.

3.3  Locality  in  Both  Space  and  Scale 

Because wavelets are localized in both position $b$ and scale $a$, the wavelet coefficients
carry information that is local in both space and scale. This is not the case with the Fourier
coefficients because the basis functions are nonlocal; a given Fourier coefficient therefore
depends on the behaviour of the whole signal. On the contrary, a given wavelet coefficient
$\tilde f(b_0,a_0)$ does not depend on the values of the signal outside the so-called ‘influence cone’
centered on $b_0$, whose width is proportional to the scale $a_0$ and depends on the support of
the wavelet (Fig. 4a). Likewise the wavelet coefficients at a given scale $a_0$ depend only on
the spectral behaviour of the signal in the bandwidth $[k_{\min}/a_0,\ k_{\max}/a_0]$, with $k_{\min}$ and
$k_{\max}$ given by the support of $\hat\psi$ (Fig. 4b). The support of $\hat\psi$ is defined as the region where
$|\hat\psi|$ is larger than a given value, because the wavelet has at least an exponential decay.

3.4  Characterization  of  the  Local  Regularity  of  a  Function 

One of the most useful properties of the wavelet transform in analyzing turbulent flows is
the fact that the local scaling of the wavelet coefficients computed in $L^1$ norm, i.e., with
the normalization $N(a)=a$ in (7), allows us to characterize the regularity of the signal
(Holschneider 1988; Jaffard 1989). Thus, if $d^m f/dx^m$ exists, i.e., if $f$ is $m$ times
continuously differentiable in $x_0$, then


(23) 

when  a  tends  to  0. 

If $f \in \Lambda^{\alpha}(x_0)$, the space of Lipschitz functions having exponent $-1<\alpha<1$, which are
continuous functions non-differentiable in $x_0$, such that

$$|f(x)-f(x_0)| \leq C\,|x-x_0|^{\alpha} \qquad (24)$$

with constant $C>0$, then

$$\tilde f(x_0,a) \sim a^{\alpha} \qquad (25)$$

when $a$ tends to 0.


Thus the behaviour of the wavelet coefficients $\tilde f$ at $x_0$ in the limit $a \to 0$ measures
the local regularity of the function $f$ in $x_0$, which is given by the slope of the modulus of
$\tilde f(x_0,a)$ represented in log-log coordinates. For instance, the wavelet coefficients computed
in $L^1$ norm of a function presenting a Lipschitz regularity $\alpha$ in $x_0$ will diverge in the very
small scale limit (Fig. 5a), while those of a function which is regular in $x_0$ will tend to zero
in the same limit (Fig. 5b).
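
The slope measurement described above can be sketched numerically. The following Python example is only an illustration under stated assumptions: the test function $f(x)=|x|^{1/2}$, the Marr wavelet, the grid, the range of scales, and the helper names `coefficient_at` and `local_scaling_exponent` are all chosen for this example and do not come from the paper. With the $L^1$ normalization $N(a)=a$, the fitted log-log slope at $x_0=0$ should come out close to the Lipschitz exponent 0.5.

```python
import numpy as np

def mexican_hat(y):
    """Marr (Mexican hat) wavelet, real-valued and zero-mean."""
    return (1.0 - y**2) * np.exp(-0.5 * y**2)

def coefficient_at(f, x, x0, a):
    """L1-normalized coefficient f~(x0, a) = (1/a) * integral f(x) psi((x - x0)/a) dx."""
    dx = x[1] - x[0]
    return np.sum(f * mexican_hat((x - x0) / a)) * dx / a

def local_scaling_exponent(f, x, x0, scales):
    """Least-squares slope of log|f~(x0, a)| against log a."""
    moduli = [abs(coefficient_at(f, x, x0, a)) for a in scales]
    slope, _ = np.polyfit(np.log(scales), np.log(moduli), 1)
    return slope

# Test signal with a cusp of Lipschitz exponent 0.5 at x0 = 0.
x = np.linspace(-1.0, 1.0, 4001)
scales = np.geomspace(0.01, 0.1, 10)
alpha = local_scaling_exponent(np.sqrt(np.abs(x)), x, 0.0, scales)
```

The scales are kept well above the grid spacing and well below the domain size, so that neither the quadrature error nor the boundaries dominate the fit.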

4. ANALYSIS OF TWO-DIMENSIONAL TURBULENT FLOWS


“In the last decade we have experienced a conceptual shift in our view of turbulence. For
flows with strong velocity shear... or other organizing characteristics, many now feel that
the spectral description has inhibited fundamental progress. The next "El Dorado" lies in
the mathematical understanding of coherent structures in weakly dissipative fluids: the
formation, evolution and interaction of metastable vortex-like solutions of nonlinear
partial differential equations...” Norman Zabusky (1984).

As Norman Zabusky stated, it is essential before modelling turbulent flows to understand
the dynamical role of coherent structures and analyze their contribution to the different
nonlinear interactions. Because the Fourier modes contain nonlocal information, with a
Fourier analysis we are unable to discriminate the role of coherent structures and we cannot
separate the coherent structures from the rest of the flow. However, this local spectral
analysis becomes possible when using the wavelet transform, and with it we can devise new
types of diagnostics. After defining them, we will apply them to analyze some vorticity
fields corresponding to the long-time evolution of a forced two-dimensional flow, computed
with a resolution of $128^2$.

Figure 5. Analysis of the local regularity of a function $f$ in $x_0$ (given by the slope of the modulus of
$\tilde f(x_0,a)$ represented in log-log coordinates).

4.1 The Wavelet Coefficients

If we denote the position as $\mathbf{b}$, the scale as $a$, and the angle as $\theta$, the wavelet coefficients
computed in $L^p$ norm are

$$\tilde f(a,\theta,\mathbf{b}) = \int f(\mathbf{x})\,\frac{1}{N(a)}\,\psi^{*}\!\left(r_\theta^{-1}\,\frac{\mathbf{x}-\mathbf{b}}{a}\right) d^2\mathbf{x} \qquad (26)$$

with

$$r_\theta = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}. \qquad (27)$$

If $N(a)=a^{1/2}$, the wavelet coefficients are in $L^2$ norm and the squared wavelet coefficients
correspond to the local energy density of the signal at location $\mathbf{b}$, scale $a$ and direction $\theta$.

If $N(a)=a$, the wavelet coefficients are in $L^1$ norm and in this case the local scaling of the
wavelet coefficients gives information on the local regularity, or the Lipschitz exponent in
the case of discontinuities, of the signal at location $\mathbf{b}$, scale $a$ and direction $\theta$.

In Figure 6 we show the 1D continuous wavelet analysis along a cut made through a
two-dimensional turbulent vorticity field. The wavelet coefficients are computed either in
$L^2$ norm (Fig. 6a) or in $L^1$ norm (Fig. 6b), using the Morlet wavelet with $k_0=5$.

In Figure 7 we show the 2D continuous wavelet analysis of a two-dimensional turbulent
vorticity field. The wavelet coefficients are computed in $L^2$ norm at three different scales,
namely 32 pixels (Fig. 7b), 16 pixels (Fig. 7c), and 2 pixels (Fig. 7d), using the isotropic
Marr wavelet (in this case, there is no angular dependence of the wavelet coefficients
because the wavelet is isotropic).

4.2  The  Intermittency  Factor 


The intermittency factor is given by the wavelet coefficients renormalized by the space
averaged energy at each scale, such that

$$I(a,\mathbf{b}) = \frac{|\tilde f(a,\theta,\mathbf{b})|^2}{\langle\, |\tilde f(a,\theta,\mathbf{b})|^2 \,\rangle_{\mathbf{b}}} \qquad (28)$$

where $\langle\cdot\rangle_{\mathbf{b}}$ denotes the average over all positions $\mathbf{b}$.

Figure 6. Continuous wavelet analysis of a one-dimensional cut made through a two-dimensional turbulent
vorticity field.

It gives information on the space variance of the energy spectrum: if $I(a,\mathbf{b})=1$ the field is
homogeneous and there is no space variance of the energy at scale $a$. If $I(a,\mathbf{b})$ is large, the field is
intermittent, namely all the energy contribution at scale $a$ comes from a few very excited regions,
while the rest of the field has little energy at this scale.
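
A minimal Python sketch of the intermittency factor is given below for an isotropic Marr wavelet applied to a periodic two-dimensional field. It assumes that the coefficients are evaluated as a product in Fourier space and that constant prefactors can be dropped because they cancel in the ratio; the function names `marr_coefficients` and `intermittency_factor` are hypothetical and the code is not the one used to produce Figure 8.

```python
import numpy as np

def marr_coefficients(field, a):
    """Coefficients of a periodic 2D field at scale a (in pixels), isotropic Marr wavelet.

    The Fourier transform of the Marr wavelet is proportional to
    |k|^2 exp(-|k|^2 / 2); constant prefactors are omitted on purpose.
    """
    ny, nx = field.shape
    kx = 2.0 * np.pi * np.fft.fftfreq(nx)
    ky = 2.0 * np.pi * np.fft.fftfreq(ny)
    k2 = kx[None, :] ** 2 + ky[:, None] ** 2
    psi_hat = (a**2 * k2) * np.exp(-0.5 * a**2 * k2)
    return np.fft.ifft2(np.fft.fft2(field) * psi_hat).real

def intermittency_factor(field, a):
    """Squared coefficients divided by their space average at scale a."""
    energy = marr_coefficients(field, a) ** 2
    return energy / energy.mean()

# A single point vortex: all the energy at scale a sits around one location,
# so the intermittency factor is large there and small elsewhere.
vortex = np.zeros((64, 64))
vortex[32, 32] = 1.0
factor = intermittency_factor(vortex, 4.0)
```

By construction the space average of the returned array is one, so values far above one flag the excited regions.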

Figure  8  shows  the  intermittency  factor  computed  at  three  different  scales,  namely  32 
pixels  (Fig.  8b),  8  pixels  (Fig.  8c),  and  2  pixels  (Fig.  8d)  using  the  isotropic  Marr  wavelet 
(there  is  no  angular  dependence  of  the  wavelet  coefficients  resulting  from  the  wavelet 
isotropy  in  this  case). 


4.3  The  Local  Energy  Spectrum 


The local energy spectrum is defined from the wavelet coefficients, such that

$$E(a,\mathbf{b}_0) = \int |\tilde f(a,\theta,\mathbf{b}_0)|^2\,d\theta. \qquad (29)$$


Figure 9 shows the local energy spectra (Fig. 9d) computed by integrating in space the
Marr wavelet coefficients after segmenting the vorticity field (Fig. 9a) into three different
regions using the Weiss criterion (Weiss 1981): the elliptical region corresponding to the
cores of the coherent structures (Fig. 9b), the parabolic region corresponding to the shear
layers at the periphery of the coherent structures (Fig. 9c), and the hyperbolic region
corresponding to the vorticity filaments of the incoherent background flow. We observe
that the elliptic, parabolic, and hyperbolic regions each scale differently. The more
coherent the region is, the steeper its spectrum, whereas an incoherent region, such as the
background flow, is much more homogeneous and has a flatter spectrum, similar to noise.

5.  FILTERING  OF  TWO-DIMENSIONAL  TURBULENT  FLOWS 
USING  CONTINUOUS  WAVELETS 

Because  the  wavelet  transform  is  invertible  it  is  always  possible  to  select  a  subset  of  the 
coefficients  and  reconstruct  a  filtered  version  of  the  field  from  them.  We  propose  several 
filtering  techniques  to  extract  coherent  structures  from  the  background  vorticity  in  two- 
dimensional  turbulent  flows.  The  first  one  consists  of  discarding  all  wavelet  coefficients 
outside  the  influence  cones  (Fig.  4a)  attached  to  the  local  maxima  of  the  vorticity  field  that 
corresponds  to  the  coherent  structures’  cores.  The  second  method  consists  of  discarding 
all  wavelet  coefficients  smaller  than  a  given  threshold  that  depends  on  the  quantity  of 
enstrophy  we  want  to  retain  in  the  filtered  vorticity  field. 
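
The second method can be sketched as follows. This Python example assumes that the sum of squared coefficient moduli is used as the measure of retained enstrophy, which holds for a vorticity field only up to the normalization of the transform; the function names `threshold_for_fraction` and `split_coefficients` are hypothetical and the example is not the authors' implementation.

```python
import numpy as np

def threshold_for_fraction(coeffs, fraction):
    """Largest cutoff such that coefficients with modulus >= cutoff
    hold at least `fraction` of the total sum of squared moduli."""
    moduli = np.sort(np.abs(coeffs).ravel())[::-1]  # descending order
    cumulative = np.cumsum(moduli**2)
    # First index at which the running sum reaches the requested share.
    n_keep = int(np.searchsorted(cumulative, fraction * cumulative[-1])) + 1
    n_keep = min(n_keep, moduli.size)
    return moduli[n_keep - 1]

def split_coefficients(coeffs, fraction):
    """Return (retained, discarded) coefficient arrays; their sum is `coeffs`."""
    keep = np.abs(coeffs) >= threshold_for_fraction(coeffs, fraction)
    return np.where(keep, coeffs, 0.0), np.where(keep, 0.0, coeffs)

# Squared moduli are 16, 9, 4, 1 (total 30); keeping at least 80 percent
# requires the two strongest coefficients (25 out of 30).
example = np.array([4.0, -3.0, 2.0, 1.0])
retained, discarded = split_coefficients(example, 0.8)
```

Applying the inverse transform to `retained` and to `discarded` separately would then give the filtered field and the remainder, since the transform is linear.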

Figure 10 shows the extraction of one coherent structure, done by filtering out all wavelet
coefficients outside the influence cone attached to the center of this coherent structure,
before computing the inverse wavelet transform. We display the complete vorticity field
(Fig. 10a), the coherent structure alone (Fig. 10b), the vorticity field without the coherent
structure (Fig. 10c), and the energy spectra of the three previous fields (Fig. 10d).

Figure 8. The intermittency factor computed using the Marr wavelet. Panel d: small scale, 2 pixels (min 0, max 44).

Figure 9. Local energy spectra computed from the wavelet coefficients after segmenting the vorticity field
into three different regions: (b) the elliptical region corresponding to the coherent structures; (c) the
parabolic region corresponding to the shear layers at the periphery of the coherent structures; (d) the
hyperbolic region corresponding to the vorticity filaments of the incoherent background flow.

Figure 10. Extraction of one coherent structure, done by filtering out all wavelet coefficients outside the
influence cone attached to the center of this coherent structure, before computing the inverse wavelet
transform: (c) the vorticity field without the coherent structure; (d) the energy spectra of the three
previous fields.

Figure 11 shows the extraction of the 40 most excited coherent structures, done by
filtering out all wavelet coefficients outside the influence cones attached to the centers of these
coherent structures, before computing the inverse wavelet transform. We display the
complete vorticity field (Fig. 11a), the 40 coherent structures alone (Fig. 11b), the
vorticity field without the coherent structures (Fig. 11c), and the energy spectra of the
three previous fields (Fig. 11d).

Figure  12  shows  the  extraction  of  all  excited  coherent  structures,  done  by  filtering  all 
wavelet  coefficients  smaller  than  a  given  threshold  and  then  computing  the  inverse  wavelet 
transform.  We  display  the  complete  vorticity  field  (Fig.  12a),  the  coherent  structures  alone 
(Fig.  12b),  the  vorticity  field  without  the  coherent  structures  (Fig.  12c),  and  the  energy 
spectra  of  the  three  previous  fields  (Fig.  12d). 

As  seen  with  the  local  energy  spectra, these  filtering  techniques  show  again  that  the 
spectral  behaviour  depends  on  the  region  of  the  flow,  with  a  tendency  to  scale  around 
near  the  cores  of  the  coherent  structures,  between  Ir*  and  kr^  at  their  periphery,  and 
around  kr^  in  the  background. 

6.  COMPRESSION  OF  TWO-DIMENSIONAL  TURBULENT  FLOWS 
USING  WAVELET  PACKETS 

Wavelet packets represent a family of orthogonal bases that unifies wavelets with Dirac,
Fourier and wavepacket functions, affording increased flexibility in tiling the information
plane, because now each element of the basis is parametrized independently in position $b$,
scale $a$ and wavenumber $k$ (cf. Coifman et al., 1992). For a given signal sampled on $N$
points the wavelet packet algorithm generates $2^N$ possible orthogonal bases and then
selects the one that minimizes the number of coefficients having significant contributions
to the total signal. In this sense, the wavelet packet algorithm defines the most efficient
basis, the so-called Best Basis, upon which to expand a given signal. If the flow is
dominated by point vortices, then it is optimally represented using the Dirac grid point
basis, and the output of the wavelet packet algorithm will reflect this. On the contrary, if
the flow is dominated by waves, then it is optimally represented using the Fourier basis,
and the output of the wavelet packet algorithm will again reflect this. If the flow behaviour
is in between these two extreme situations, other bases will be more appropriate and the
wavelet packet algorithm will give us the Best Basis in which the vorticity field can be
represented with the smallest number of significant coefficients. The computation of the
Best Basis for a signal sampled on $N$ points is done in $N\log_2 N$ operations, while the
reconstruction of the signal from its projection onto the Best Basis is done in $N$
operations.
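
The Best Basis search described above can be illustrated with a short Python sketch. It relies on two simplifying assumptions that are not taken from this paper: the shortest quadrature mirror filter (Haar) replaces the more regular filters used in practice, and an additive entropy cost stands in for "the number of coefficients having significant contributions". The function names are hypothetical, and the signal is assumed to be non-zero with a length that is a power of two.

```python
import numpy as np

def haar_split(v):
    """One orthonormal Haar filtering step: (averages, differences)."""
    even, odd = v[0::2], v[1::2]
    return (even + odd) / np.sqrt(2.0), (even - odd) / np.sqrt(2.0)

def entropy_cost(v, total_energy):
    """Additive cost -sum p log p, with p = |c|^2 / total energy."""
    p = v**2 / total_energy
    p = p[p > 0]
    return float(-np.sum(p * np.log(p)))

def best_basis(v, total_energy=None):
    """Return (cost, coefficient blocks) of the cheapest basis in the packet tree.

    A block is kept whole when its own cost does not exceed the best
    cost achievable by splitting it further.
    """
    if total_energy is None:
        total_energy = float(np.sum(v**2))
    own_cost = entropy_cost(v, total_energy)
    if v.size == 1:
        return own_cost, [v]
    low, high = haar_split(v)
    cost_low, blocks_low = best_basis(low, total_energy)
    cost_high, blocks_high = best_basis(high, total_energy)
    if cost_low + cost_high < own_cost:
        return cost_low + cost_high, blocks_low + blocks_high
    return own_cost, [v]

# A spike is already optimally represented on the grid point basis ...
spike = np.zeros(8)
spike[3] = 1.0
cost_spike, blocks_spike = best_basis(spike)
# ... whereas a constant signal collapses onto a single coefficient.
cost_flat, blocks_flat = best_basis(np.ones(8))
```

Each level of the tree is visited once with a number of operations proportional to the signal length, which is in line with the $N\log_2 N$ count quoted above.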

Figure 13 shows the compression of a two-dimensional vorticity field using its wavelet
packet coefficients with three different compression ratios. For a compression by 2 (Fig.
13a) we split the field into the 50% strongest wavelet packet coefficients and the 50%
weakest wavelet packet coefficients. Then for a compression by 20 (Fig. 13b) we split the
field into the 5% strongest wavelet packet coefficients and the 95% weakest wavelet
packet coefficients, and for a compression by 200 (Fig. 13c) we split the field into the
0.5% strongest wavelet packet coefficients and the 99.5% weakest wavelet packet
coefficients. For each of the three compression ratios we display the uncompressed field
with its energy spectrum, the compressed field with its energy spectrum, and the discarded
field with its energy spectrum. These results have been obtained in collaboration with
Meyer, Pascal and Wickerhauser and are extensively discussed in Farge et al. (1992).

With  these  compression  techniques  we  find  as  before  that  the  spectral  behaviour  depends  on  the 
region  of  the  flow  we  analyze,  with  a  tendency  to  scale  around  kr^  near  the  cores  of  the  coherent 
structures,  around  Ic*  at  their  periphery,  and  around  in  the  background. 

7. CONCLUSION

Nowadays  turbulence  is  commonly  viewed  from  one  of  two  alternative  perspectives, 
depending  upon  which  side  of  the  Fourier  transform  one  looks  from.  In  physical  space,  we 
observe  coherent  vortices  and  wonder  if  there  is  universality  in  their  structure  and 
interactions.  In  Fourier  space,  we  see  transfers  of  energy  and  enstrophy  between  different 
scales  of  motion  and  ask,  for  example,  if  the  slope  of  the  energy  spectrum  is  universal.  The 
selection  of  bases  in  which  turbulence  may  be  examined  must  be  extended  if  these 
perspectives  are  to  be  effectively  reconciled.  Through  the  use  of  wavelets  and  wavelet 
packets,  we  have  constructed  a  class  of  bases,  which  includes  grid  point  and  Fourier 
representations as special cases, from which we select the basis which is optimal for a
given flow, namely the one which compresses the information the most while keeping
track of the behaviour of the flow in both space and scale.

With  such  a  wavelet  or  wavelet  packet  representation  we  can  compute  a  local  energy 
spectrum.  Using  the  continuous  wavelet  transform,  we  have  shown  that  different  regions 
of  the  flow  present  different  slopes  for  the  local  energy  spectrum.  Clearly  the  Fourier 
transform  is  unable  to  detect  these  different  spectral  behaviours  which  vary  in  space,  while 
the  wavelet  transform  is  here  the  appropriate  tool.  Typically  we  have  observed  that  the 
cores  of  the  coherent  structures,  which  correspond  to  the  elliptic  regions,  scale  as  the 
shear  layers  around  the  coherent  structures,  which  correspond  to  the  parabolic  regions, 
scale  as  Ic*,  while  the  vorticity  filaments  in  the  background,  which  correspond  to  the 
hyperbolic  regions,  scale  as  kr'^.  From  this  result  we  infer  that  the  variation  of  the  Fourier 
spectral slope we commonly observe for two-dimensional flows may be related to the
density of coherent structures, which varies depending on the initial conditions and
on the forcing. If this is true, we may hope that the local scaling of the different regions
is universal enough to allow their behaviour to be modelled, each region then
having its own parametrization.

Figure 13a. Compression of a two-dimensional vorticity field using its wavelet packet coefficients,
compression by a factor 2: (top) the uncompressed vorticity field and its Fourier energy spectrum, (center
left and lower left) the compressed field, reconstructed from the 50% strongest wavelet packet coefficients,
and its energy spectrum, (center right and lower right) the discarded field, reconstructed from the 50%
weakest wavelet packet coefficients, and its energy spectrum. The visualisation was done in collaboration
with Jean-François Colonna.

Figure 13b. Compression of a two-dimensional vorticity field using its wavelet packet coefficients,
compression by a factor 20: (top) the uncompressed field and its energy spectrum, (center left and lower
left) the compressed field and its energy spectrum, (center right and lower right) the discarded field and
its energy spectrum. The visualisation was done in collaboration with Jean-François Colonna.

Figure 13c. Compression of a two-dimensional vorticity field using its wavelet packet coefficients,
compression by a factor 200: (top) the uncompressed vorticity field and its Fourier energy spectrum, (center
left and lower left) the compressed field and its energy spectrum, (center right and lower right) the
discarded field, reconstructed from the 99.5% weakest wavelet packet coefficients, and its energy
spectrum. The visualisation was done in collaboration with Jean-François Colonna.

Using  the  orthogonal  wavelet  packet  transform,  we  have  shown  that  the  significant 
coefficients  correspond  to  the  coherent  structures,  while  the  weak  coefficients  correspond 
to  the  vorticity  filaments  which  are  only  passively  advected  by  the  coherent  structures. 
One  possible  application  of  the  wavelet  packet  algorithm  is  to  apply  it  from  time  to  time 
during  a  numerical  simulation,  in  order  to  separate  regions  with  highly  active  small  scales, 
which  need  a  better  grid  resolution,  from  regions  with  inactive  small  scales,  which  do  not 
contribute  much  to  the  dynamics  and  can  either  be  discarded  or  modelled.  Indeed  the 
wavelet  packet  Best  Basis  seems  to  distinguish  the  low-dimensional,  dynamically  active 
part  of  the  flow  from  the  high-dimensional,  passive  components.  It  gives  us  some  hope  of 
drastically  reducing  the  number  of  degrees  of  freedom  necessary  to  the  computation  of 
two-dimensional  turbulent  flows. 

THE NUMERICAL INVERSE SCATTERING TRANSFORM:
NONLINEAR FOURIER ANALYSIS AND
NONLINEAR FILTERING OF OCEANIC SURFACE WAVES


A. R. Osborne

Istituto di Fisica Generale dell'Università, Via Pietro Giuria 1, 10125 Torino, Italy

ABSTRACT 

Nonlinear Fourier analysis is discussed as it arises from the exact spectral solution to
large classes of nonlinear wave equations which are integrable by the inverse scattering
transform (IST). The approach may be viewed as a generalization of the ordinary, linear
Fourier transform or Fourier series. Numerical methods are discussed which allow for
implementation of the approach as a tool for the time series analysis of oceanic wave
data. I specifically consider the case for shallow water, where integrable nonlinear wave
motion is governed by the Korteweg-deVries equation with periodic/quasi-periodic
boundary conditions. Numerical procedures given herein allow the computation of a
nonlinear Fourier series for a measured time series. The nonlinear oscillation modes of
KdV obey a linear superposition law, just as do the sine waves of a linear Fourier series.
However, the KdV basis functions themselves are highly nonlinear, undergo nonlinear
interactions with each other and are distinctly non-sinusoidal. I analyze surface wave data
from the Adriatic Sea and apply the concept of nonlinear filtering to enhance
understanding of nonlinear interactions.


INTRODUCTION 

This paper summarizes a new numerical approach for the nonlinear Fourier analysis
of space and time series of complex, nonlinear wave trains. The method, based upon the
(periodic/quasi-periodic) inverse scattering transform (IST), is a kind of nonlinear
generalization of the ordinary, linear Fourier transform. I focus on nonlinear wave motion
for shallow-water waves as governed by the Korteweg-deVries (KdV) equation. IST may
be exploited to determine the numerical inverse scattering transform (NIST) spectrum of
a measured or computed wave train which is assumed to be periodic (or quasi-periodic) in
space or in time. The approach may also be applied to numerically construct complex
solutions to the KdV equation. I build on previous successes in the application of the
periodic scattering transform to the analysis of computer generated or experimentally
measured data [Bishop et al., 1986; Osborne and Bergamasco, 1985, 1986; Osborne and
Segre, 1990; Terrones et al., 1990; Flesch et al., 1991; Osborne et al., 1991; Osborne,
1991a, 1991b; McLaughlin and Schober, 1992; Osborne, 1993]. In particular I analyze
measured wave data obtained in the Adriatic Sea on a fixed offshore platform in 16.5 m
of water, about 10 km from Venice, Italy. This paper describes some of the recent work
done in collaboration with L. Cavaleri [Osborne et al., 1991; Osborne and Cavaleri,
1993]. It is hoped that the results of this paper will complement other recent work in the
propagation of nonlinear shallow water waves [Elgar and Guza, 1986].


THE  KdV  EQUATION  AND  PERIODIC  INVERSE  SCATTERING  THEORY 

The Korteweg-deVries equation describes (among many other physical applications)
the  motion  of  small,  finite-amplitude  nonlinear  wave  trains  in  shallow  water.  KdV  was 
the  first  of  many  nonlinear  wave  equations  to  be  completely  integrated  by  what  is  now 
called  the  inverse  scattering  transform  [Zakharov  et  al.,  1980;  Ablowitz  and  Segur,  1981; 
Dodd  et  al.,  1982;  Newell,  1985;  Degasperis,  1991]. 

The dimensional form for the (space-like) KdV equation is given by [Whitham, 1974;
Miles, 1980]:

$$\eta_t + c_0 \eta_x + \alpha \eta \eta_x + \beta \eta_{xxx} = 0 \tag{1}$$

where $\eta(x,t)$ is the wave amplitude as a function of space $x$ and time $t$. For shallow
water wave motion the constant coefficients of KdV are given by $c_0 = \sqrt{gh}$,
$\alpha = 3c_0/2h$ and $\beta = c_0 h^2/6$. Eq. (1) has the linearized dispersion relation
$\omega = c_0 k - \beta k^3$; $g$ is the acceleration of gravity, $c_0$ is the linear phase speed, and $h$ is the
water depth. Subscripts with respect to $x$ and $t$ refer to partial derivatives. KdV solves the
Cauchy problem: given the spatial behavior of the wave train at $t = 0$, $\eta(x,0)$, (1)
determines the motion for all space and time thereafter, $\eta(x,t)$. Here we use periodic
boundary conditions so that $\eta(x,t) = \eta(x+L,t)$, for $L$ the spatial period of the wave
train.
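
As a small numerical illustration of the coefficients just defined, the sketch below evaluates $c_0$, $\alpha$ and $\beta$ for a given depth, together with the linearized dispersion relation of (1). The function names and the value g = 9.81 m/s^2 are assumptions made here for illustration; the depth of 16.5 m is the one quoted in the Introduction for the Adriatic platform.

```python
import math

def kdv_coefficients(h, g=9.81):
    """Coefficients of the space-like KdV equation (1) for water depth h (m)."""
    c0 = math.sqrt(g * h)          # linear phase speed
    alpha = 3.0 * c0 / (2.0 * h)   # coefficient of the nonlinear term
    beta = c0 * h**2 / 6.0         # coefficient of the dispersive term
    return c0, alpha, beta

def kdv_omega(k, c0, beta):
    """Linearized dispersion relation of (1): omega = c0*k - beta*k**3."""
    return c0 * k - beta * k**3

# Depth of the Adriatic Sea measurement site (assumed g = 9.81 m/s^2).
c0, alpha, beta = kdv_coefficients(16.5)
```

For long waves (small $k$) `kdv_omega` stays close to the nondispersive value `c0 * k`; the cubic term lowers the frequency as $k$ grows.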

The most common experimental situation is to record data as a function of time at a
single spatial location. The reasons are often economical, e.g., the measurement of time
series requires a single wave staff or pressure recorder; the measurement of space series
requires remote sensing capability. These considerations motivate the need to determine
the scattering transform of a time series, $\eta(0,t)$. To this end one may apply the time-like
KdV equation (TKdV) [Karpman, 1974; Ablowitz and Segur, 1981]:

$$\eta_x + c_0' \eta_t + \alpha' \eta \eta_t + \beta' \eta_{ttt} = 0 \tag{2}$$

where $c_0' = 1/c_0$, $\alpha' = -\alpha/c_0^2$ and $\beta' = -\beta/c_0^4$; (2) has the linearized dispersion relation
$k = \omega/c_0 + (\beta/c_0^4)\omega^3$. TKdV solves a boundary value problem: given the temporal
evolution $\eta(0,t)$ at a fixed spatial location $x = 0$, (2) determines the wave motion over all
space as a function of time, $\eta(x,t)$. Periodic boundary conditions ($\eta(x,t) = \eta(x,t+T)$)
are assumed herein in order to be consistent with linear Fourier algorithms (discrete and
fast Fourier transforms). Due to recent advances in numerical methods TKdV may now
be routinely applied to the time series analysis of experimental data [Osborne, 1991a;
Osborne et al., 1991; Osborne and Segre, 1990].
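
The relation between the space-like and time-like coefficients can be checked numerically. The following sketch (function names and g = 9.81 m/s^2 are assumptions for illustration) computes the TKdV coefficients and the time-like dispersion relation, which should invert the space-like relation $\omega = c_0 k - \beta k^3$ to leading order for low frequencies.

```python
import math

def tkdv_coefficients(h, g=9.81):
    """Return (c0', alpha', beta') of the time-like KdV equation (2)."""
    c0 = math.sqrt(g * h)
    alpha = 3.0 * c0 / (2.0 * h)
    beta = c0 * h**2 / 6.0
    return 1.0 / c0, -alpha / c0**2, -beta / c0**4

def tkdv_wavenumber(omega, h, g=9.81):
    """Linearized dispersion relation of (2): k = omega/c0 + (beta/c0**4)*omega**3."""
    c0p, _, betap = tkdv_coefficients(h, g)
    return c0p * omega - betap * omega**3

# Example: a 0.05 Hz component in 16.5 m of water.
omega = 2.0 * math.pi * 0.05
k = tkdv_wavenumber(omega, 16.5)
```

Substituting `k` back into the space-like relation recovers `omega` up to higher-order terms in the small parameter $\beta\omega^2/c_0^3$.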

All  solutions  of  (1)  may  be  easily  converted  to  all  solutions  of  (2)  by  simple 
transformations  given  elsewhere  [Osborne,  1983;  Osborne,  1993].  Hence  the  scattering 
transform  of  (2)  may  be  easily  expressed  in  terms  of  the  scattering  transform  of  (1).  For 
present purposes it is only necessary to note that given the IST for (1), the IST for (2)
may be easily determined. Therefore, I give herein only the mathematical development of
IST for (1).

According to the periodic inverse scattering transform the solution to the periodic
KdV equation (1) may be written as a linear superposition of nonlinearly interacting,
nonlinear waves called hyperelliptic functions, $\mu_j(x; x_0, t)$:

$$\lambda\eta(x,t) = -E_1 + \sum_{j=1}^{N} \left[ 2\mu_j(x; x_0, t) - E_{2j} - E_{2j+1} \right] \tag{3}$$

The constant parameter $\lambda = \alpha/6\beta$. This is the first of the so-called trace formulae for the
KdV equation [Dubrovin and Novikov, 1974; Flaschka and McLaughlin, 1976] and may
be interpreted as a kind of nonlinear Fourier series. The constant parameters $E_i$ ($1 \le i \le 2N+1$)
are eigenvalues of the "main spectrum" of periodic theory as discussed in the next
section; $x_0$ is an arbitrary base point in the interval $0 \le x \le L$. The $\mu_j$ are the nonlinear
oscillation modes of periodic KdV, i.e., they are analogous to the sine waves of linear
Fourier analysis. The $\mu_j$ spatially evolve according to the following system of coupled,
nonlinear, ordinary differential equations:

$$\frac{d\mu_j}{dx} = \frac{2i\sigma_j R^{1/2}(\mu_j)}{\prod_{k=1,\,k \ne j}^{N} (\mu_j - \mu_k)} \tag{4}$$

where

$$R(\mu_j) = \prod_{k=1}^{2N+1} (\mu_j - E_k) \tag{5}$$

The $\sigma_j = \pm 1$ are the signs of the square root of $R(\mu_j)$. The $\mu_j$ dynamically evolve on
two-sheeted Riemann surfaces; the branch points connecting the surfaces are referred to
as "band edges" and are denoted by the $E_{2j}$ and $E_{2j+1}$. The spatially and temporally
varying $\mu_j$ evolve inside an "open band," e.g., in the interval $E_{2j} \le \mu_j \le E_{2j+1}$, and
oscillate between these limits as a function of $x$ and $t$, as will be demonstrated graphically
below. When a $\mu_j$ reaches a band edge (either $E_{2j}$ or $E_{2j+1}$) the sign $\sigma_j$ changes and the
motion leaps to the other Riemann sheet. This fact, together with the strong nonlinear
coupling occurring among the $\mu_j$, presented considerable difficulties for Osborne and
Segre [1990] in numerical integrations of (4). These difficulties have been largely
circumvented by the methods given herein for the time series analysis of nonlinear wave
trains.

The temporal evolution of the $\mu_j$ is given by the following differential equations:

$$\frac{\partial \mu_j}{\partial t} = -2\left[\lambda\eta(x,t) - 2\mu_j\right] \frac{\partial \mu_j}{\partial x} \tag{6}$$

where $\lambda\eta(x,t)$ is given by (3). The space (4) and time (6) ODEs evolve the $\mu_j(x,t)$ (the
nonlinear oscillation modes of KdV) and the nonlinear Fourier series (3) allows one to
construct general solutions to the KdV equation. In what follows I describe methods for
numerically computing the oscillation modes $\mu_j(x,0)$ at a particular instant of time, $t = 0$.
The requisite numerical methods are then christened nonlinear Fourier analysis
procedures for space or time series [Osborne, 1991a].

Generally speaking I refer to the numerical determination of the main spectrum
($E_i$; $1 \le i \le 2N+1$) and the auxiliary spectrum ($\mu_j(0,0)$, $\sigma_j = \pm 1$; $1 \le j \le N$) as the direct
scattering transform (see details in the Section below). The computation of the
hyperelliptic functions $\mu_j(x,t)$ as solutions of the nonlinear ODEs (4)-(6) and the
construction of solutions of the KdV equation by the trace formula (3) constitutes the
inverse scattering transform. Herein I (a) discuss new numerical procedures for obtaining
the direct scattering transform and (b) show that the inverse scattering transform as
obtained by numerical integration of (4)-(6) (e.g. as considered by Osborne and Segre
[1990]) can be replaced by a much simpler, more precise and faster algorithm.


THE  PERIODIC  INVERSE  SCATTERING  TRANSFORM 

The spectral problem (the direct scattering transform) for KdV (1) is the Schroedinger
eigenvalue problem of quantum mechanics:

$$\psi_{xx} + [\lambda\eta(x) + k^2]\psi = 0 \qquad (k^2 = E) \tag{7}$$

where $\eta(x) = \eta(x,0)$ is the solution to the KdV equation (1) at an arbitrary time $t = 0$; $k$
is the spectral wavenumber. Periodic boundary conditions are assumed so that we take
$\eta(x,t) = \eta(x+L,t)$ for $L$ the period.

Details  of  the  inverse  scattering  theory  will  not  be  given  here,  but  may  be  found 
elsewhere  [Dubrovin  and  Novikov,  1974;  Dubrovin,  Matveev  and  Novikov  1976; 
Flaschka  and  McLaughlin,  1976;  McKean  and  Trubowitz,  1976].  For  numerical  purposes 
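it is appropriate to consider a basis of solutions $(c, s)$ of (7) such that

$$\begin{pmatrix} c(x_0) & c'(x_0) \\ s(x_0) & s'(x_0) \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \tag{8}$$

The Wronskian $W(c, s) = 1$ so that $(c, s)$ is a basis set of (7). The matrix $\alpha$ carries the
solution of (7) from the point $x$ to $x + L$:

$$\begin{pmatrix} c(x+L) & c'(x+L) \\ s(x+L) & s'(x+L) \end{pmatrix} = \begin{pmatrix} \alpha_{11} & \alpha_{12} \\ \alpha_{21} & \alpha_{22} \end{pmatrix} \begin{pmatrix} c(x) & c'(x) \\ s(x) & s'(x) \end{pmatrix} \tag{9}$$

$\alpha$ is often referred to as the monodromy matrix. This is the fundamental matrix of
periodic spectral theory for KdV; $\alpha$ contains all spectral information about KdV in the
wavenumber domain.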

The so-called main spectrum of KdV consists of eigenvalues $E_i$ that correspond to the
Bloch eigenfunctions of the Schroedinger equation (7) for a particular period $L$. The
auxiliary spectrum is defined as the eigenvalues for which the eigenfunctions $s(x)$ have
the fixed boundary conditions $s(x_0 + L) = s(x_0) = 0$. To this end one has these specific
spectral definitions:

$$\begin{aligned}
&\text{main spectrum } \{E_i;\ 1 \le i \le 2N+1\}: & \tfrac{1}{2}(\alpha_{11} + \alpha_{22})(E) &= \pm 1 \\
&\text{auxiliary spectrum } \{\mu_j;\ 1 \le j \le N\}: & \alpha_{21}(\mu) &= 0 \\
& & \{\sigma_j\} &= \{\mathrm{sgn}[\alpha_{11}(E) - \alpha_{22}(E)]_{E=\mu_j};\ 1 \le j \le N\}
\end{aligned} \tag{10}$$

The eigenvalues $(E_i, \mu_j, \sigma_j)$ constitute the direct scattering transform of a wave train of
$N$ degrees of freedom, $1 \le i \le 2N+1$; $1 \le j \le N$. The inverse scattering transform, (3)-(6),
then allows for the construction of complex wave train solutions of the KdV equation.

THE NUMERICAL ALGORITHM

The numerical search for the scattering eigenvalues suggests the need for
computing the derivatives of the matrix $\alpha$ with respect to the energy $E$. This is because
one normally uses a Newtonian numerical root-finding algorithm to determine the
eigenvalues. To achieve this goal, a matrix method for obtaining the evolution of the
eigenfunction as a function of $x$ and $E$ for a particular wave train $\eta(x,0)$ has been
developed. The key to this approach is the analytical estimation of derivatives of the
matrix elements with respect to $E$.

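To this end the spectral equations are

$$\psi_{xx} = -q\psi, \qquad \psi_{xxE} = -q\psi_E - \psi \tag{11}$$

where the subscripts refer to differentiation with respect to $x$ and $E$; $q(x) = \lambda\eta(x) + E$.
Writing (11) in four-vector notation and using a Taylor series expansion for the solution
to the scattering equations (11) one obtains

$$\begin{pmatrix} \psi(x+\Delta x) \\ \psi_x(x+\Delta x) \\ \psi_E(x+\Delta x) \\ \psi_{xE}(x+\Delta x) \end{pmatrix} = H \begin{pmatrix} \psi(x) \\ \psi_x(x) \\ \psi_E(x) \\ \psi_{xE}(x) \end{pmatrix} \tag{12}$$

where

$$H = \begin{pmatrix} T & 0 \\ T_E & T \end{pmatrix} \tag{13}$$

Each element of $H$ is a two-by-two matrix. The matrix 0 has zero for all its elements and
the other matrices are given by:

$$T = \begin{pmatrix} \cos(\kappa\Delta x) & \dfrac{\sin(\kappa\Delta x)}{\kappa} \\ -\kappa\sin(\kappa\Delta x) & \cos(\kappa\Delta x) \end{pmatrix} \tag{14}$$

and

$$T_E = \frac{\partial T}{\partial E} = \begin{pmatrix} -\dfrac{\Delta x\sin(\kappa\Delta x)}{2\kappa} & \dfrac{\Delta x\cos(\kappa\Delta x)}{2\kappa^2} - \dfrac{\sin(\kappa\Delta x)}{2\kappa^3} \\ -\dfrac{\Delta x\cos(\kappa\Delta x)}{2} - \dfrac{\sin(\kappa\Delta x)}{2\kappa} & -\dfrac{\Delta x\sin(\kappa\Delta x)}{2\kappa} \end{pmatrix} \tag{15}$$

for $\kappa = q^{1/2} = (\lambda\eta(x) + E)^{1/2}$. While $\kappa$ may be either real or imaginary, the matrix $T$ is
always real with determinant 1. This property is exploited in the numerical algorithm
below.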

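As in previous numerical problems of this type I assume the wave train $\eta(x)$ has the
form of a piecewise constant function with $2M$ partitions on the periodic interval $(0, L)$,
where the discretization interval is $\Delta x = L/2M$ [Osborne, 1991a]. Each partition has
wave amplitude $\eta_n$ ($1 \le n \le 2M$) which is associated with a discrete value of the spatial
variable $x_n = n\Delta x$. The four-by-four scattering matrix $\mathbf{M}$ can then be defined as the
ordered product of the partition matrices $H$ over one period:

$$\mathbf{M} = H(\eta_{2M}, \Delta x) \cdots H(\eta_2, \Delta x)\, H(\eta_1, \Delta x) \tag{16}$$

The initial conditions of the basis $(c, s)$ at the base point $x_0$ are given by

$$\begin{pmatrix} c(x_0) \\ c'(x_0) \\ c_E(x_0) \\ c'_E(x_0) \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix} \tag{17}$$

$$\begin{pmatrix} s(x_0) \\ s'(x_0) \\ s_E(x_0) \\ s'_E(x_0) \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \\ 0 \\ 0 \end{pmatrix} \tag{18}$$

From the definition of the matrix $\alpha$ one has

$$\{\alpha_{ij}\} = \begin{pmatrix} c(x+L) & c'(x+L) \\ s(x+L) & s'(x+L) \end{pmatrix} \begin{pmatrix} c(x) & c'(x) \\ s(x) & s'(x) \end{pmatrix}^{-1} \tag{19}$$

Thus at $x_0$ one finds

$$\tfrac{1}{2}(\alpha_{11} + \alpha_{22}) = \tfrac{1}{2}(M_{11} + M_{22}), \qquad \alpha_{21} = M_{12} \tag{20}$$

while the derivatives are given by

$$\frac{\partial}{\partial E}\,\tfrac{1}{2}(\alpha_{11} + \alpha_{22}) = \tfrac{1}{2}(M_{31} + M_{42}) \tag{21}$$

$$\frac{\partial \alpha_{21}}{\partial E} = M_{32} \tag{22}$$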
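
A minimal sketch of how the analytic $E$-derivatives can drive a Newton iteration for an auxiliary eigenvalue (a root of $\alpha_{21}$) is given below. It is an illustration under stated assumptions, not the code used for the results in this paper: the function names are hypothetical, complex arithmetic from `cmath` is used only as a shortcut to evaluate the real blocks $T$ and $T_E$, and the case $q = 0$ is not handled.

```python
import cmath
import math

def blocks(q, dx):
    """Real 2x2 blocks T and T_E = dT/dE for q = lambda*eta_n + E (q != 0)."""
    k = cmath.sqrt(q)                      # real for q > 0, imaginary for q < 0
    c, s = cmath.cos(k * dx), cmath.sin(k * dx)
    t = ((c.real, (s / k).real), ((-k * s).real, c.real))
    te = (((-dx * s / (2 * k)).real, (dx * c / (2 * k**2) - s / (2 * k**3)).real),
          ((-dx * c / 2 - s / (2 * k)).real, (-dx * s / (2 * k)).real))
    return t, te

def mv(a, v):
    """Product of a 2x2 matrix with a 2-vector."""
    return (a[0][0] * v[0] + a[0][1] * v[1], a[1][0] * v[0] + a[1][1] * v[1])

def s_end(eta, lam, E, dx):
    """Propagate s (s = 0, s' = 1 at x0) over one period.

    Returns (s(x0 + L), ds(x0 + L)/dE), i.e. alpha_21 = M_12 and its
    E-derivative M_32 in the notation of the four-by-four matrix.
    """
    u, v = (0.0, 1.0), (0.0, 0.0)          # (s, s') and (s_E, s'_E)
    for eta_n in eta:
        t, te = blocks(lam * eta_n + E, dx)
        v = tuple(a + b for a, b in zip(mv(te, u), mv(t, v)))  # uses old u
        u = mv(t, u)
    return u[0], v[0]

def newton_aux(eta, lam, E, dx, tol=1e-12, itmax=50):
    """Newton iteration for a root of alpha_21(E), starting from the guess E."""
    for _ in range(itmax):
        f, df = s_end(eta, lam, E, dx)
        step = f / df
        E -= step
        if abs(step) < tol:
            break
    return E
```

For a flat wave train ($\eta_n = 0$) with $L = 1$, $s(x_0 + L) = \sin(\sqrt{E})/\sqrt{E}$, so an iteration started near $E = 9$ should settle on $E = \pi^2$.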


Implementation  of  the  Numerical  Algorithm 

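Because $\kappa = (\lambda\eta(x,0) + E)^{1/2}$ can be either real or imaginary, but not complex, the
matrix $H$ is always real. This result allows implementation of an algorithm which is
entirely real. The following relations have been used in the computer code:

$$T = \begin{pmatrix} T_{11} & T_{12} \\ T_{21} & T_{22} \end{pmatrix} \tag{23}$$

where, with $\kappa' = |\kappa|$,

$$T_{11} = T_{22} = \begin{cases} \cos(\kappa'\Delta x) & \text{if } \kappa^2 \ge 0 \\ \cosh(\kappa'\Delta x) & \text{if } \kappa^2 < 0 \end{cases} \tag{24}$$

$$T_{12} = \begin{cases} \dfrac{\sin(\kappa'\Delta x)}{\kappa'} & \text{if } \kappa^2 \ge 0 \\ \dfrac{\sinh(\kappa'\Delta x)}{\kappa'} & \text{if } \kappa^2 < 0 \end{cases} \tag{25}$$

$$T_{21} = \begin{cases} -\kappa'\sin(\kappa'\Delta x) & \text{if } \kappa^2 \ge 0 \\ \kappa'\sinh(\kappa'\Delta x) & \text{if } \kappa^2 < 0 \end{cases} \tag{26}$$

and analogously for the matrix $T_E$.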
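
The entirely real evaluation of $T$ can be sketched as follows, together with the half-trace of the monodromy matrix for a piecewise constant wave train. This is an illustrative sketch only: the function names are hypothetical, and the band edges (where the half-trace equals $\pm 1$) are located here by a simple scan and bisection rather than by the Newton iteration with analytic derivatives described in the text.

```python
import math

def transfer_matrix(q, dx):
    """Real 2x2 matrix T for one partition, with q = lambda*eta_n + E = kappa**2."""
    kp = math.sqrt(abs(q))                 # kappa' = |kappa|
    if kp == 0.0:                          # limit kappa -> 0
        return ((1.0, dx), (0.0, 1.0))
    if q > 0.0:                            # kappa real
        c, s = math.cos(kp * dx), math.sin(kp * dx)
        return ((c, s / kp), (-kp * s, c))
    c, s = math.cosh(kp * dx), math.sinh(kp * dx)   # kappa imaginary
    return ((c, s / kp), (kp * s, c))

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(2))
                       for j in range(2)) for i in range(2))

def half_trace(eta, lam, E, dx):
    """Half-trace of the 2x2 monodromy matrix over one period."""
    m = ((1.0, 0.0), (0.0, 1.0))
    for eta_n in eta:
        m = matmul(transfer_matrix(lam * eta_n + E, dx), m)
    return 0.5 * (m[0][0] + m[1][1])

def band_edges(eta, lam, dx, e_min, e_max, n_scan=2000):
    """Scan the half-trace for crossings of +1 and -1; refine each by bisection."""
    es = [e_min + (e_max - e_min) * i / n_scan for i in range(n_scan + 1)]
    edges = []
    for level in (1.0, -1.0):
        f = [half_trace(eta, lam, e, dx) - level for e in es]
        for i in range(n_scan):
            if f[i] * f[i + 1] < 0.0:
                lo, hi, flo = es[i], es[i + 1], f[i]
                for _ in range(60):
                    mid = 0.5 * (lo + hi)
                    fm = half_trace(eta, lam, mid, dx) - level
                    if flo * fm <= 0.0:
                        hi = mid
                    else:
                        lo, flo = mid, fm
                edges.append(0.5 * (lo + hi))
    return sorted(edges)
```

A scan of this kind only finds crossings that are resolved by the chosen grid in $E$; very narrow gaps would need a finer scan or the derivative-based search.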

The reconstruction of complex solutions of the KdV equation by (3) (as well as
nonlinear filtering) is carried out by computing the auxiliary spectra $\mu_j(x_0 = x_n)$ for the
$2M$ different base points $x_n$. The approach is formally called base
point iteration and is carried out by computing $2M$ different monodromy matrices (16)
which differ from each other by a horizontal shift $\Delta x$ in the wave train $\eta_n$. This
procedure arises from the following similarity transformation which is easily seen from
(16):

$$\mathbf{M}(x_{n+1}, E) = H(\eta_n, E)\,\mathbf{M}(x_n, E)\,H(\eta_n, E)^{-1} \tag{27}$$

The latter expression relates the matrix $\mathbf{M}(x_{n+1}, E)$ at a base point $x_{n+1}$ to the previously
computed matrix $\mathbf{M}(x_n, E)$ at the base point $x_n$ for a particular value of $E = k^2$. Values of
the auxiliary spectra for each $x_n$ are computed from the matrices $\mathbf{M}(x_n, E)$.
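
The similarity transformation behind base point iteration can be illustrated on the two-by-two block alone. In the sketch below (hypothetical function names; only the block $T$ is propagated, and $q > 0$ is assumed for brevity) each new monodromy matrix is obtained from the previous one, and the result can be compared with a direct recomputation on the cyclically shifted wave train.

```python
import math

def transfer_matrix(q, dx):
    """2x2 matrix T for one partition; this sketch assumes q = kappa**2 > 0."""
    k = math.sqrt(q)
    c, s = math.cos(k * dx), math.sin(k * dx)
    return ((c, s / k), (-k * s, c))

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(2))
                       for j in range(2)) for i in range(2))

def inverse(t):
    """Inverse of a 2x2 matrix with unit determinant."""
    return ((t[1][1], -t[0][1]), (-t[1][0], t[0][0]))

def monodromy(eta, lam, E, dx):
    """Ordered product of the partition matrices over one period."""
    m = ((1.0, 0.0), (0.0, 1.0))
    for eta_n in eta:
        m = matmul(transfer_matrix(lam * eta_n + E, dx), m)
    return m

def base_point_iteration(eta, lam, E, dx):
    """Monodromy matrices for successive base points via the similarity step."""
    m = monodromy(eta, lam, E, dx)
    out = [m]
    for eta_n in eta[:-1]:
        t = transfer_matrix(lam * eta_n + E, dx)
        m = matmul(matmul(t, m), inverse(t))   # shift the base point by dx
        out.append(m)
    return out
```

Because the matrices for different base points are similar, their traces agree, so the main spectrum does not depend on the base point; only the off-diagonal element used for the auxiliary spectrum changes from point to point.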

Knowledge of the auxiliary spectra at every point $x_n$ allows reconstruction of the wave
train $\eta(x_n)$ via a discrete version of (3):

$$\lambda\eta(x_n) = -E_1 + \sum_{j=1}^{N} \left[ 2\mu_j(x_n) - E_{2j} - E_{2j+1} \right] \tag{28}$$

for $n = 1, 2, \ldots, 2M$. This is a finite-term nonlinear generalization of a Fourier series
for the discrete wave train $\eta(x_n)$. As indicated by the notation, each nonlinear oscillation
mode $\mu_j$ implicitly depends upon the associated wavenumber $k_j$ of the mode. The $k_j$ are
theoretically given by the simple relation $k_j = j\Delta k$, $\Delta k = 2\pi/L$; surprisingly these are
exactly the same as for the linear Fourier transform, provided that periodicity is assumed.
The IST spectrum then consists of the widths of the open bands of the Floquet
discriminant, $a_j = (E_{2j+1} - E_{2j})/2\lambda$, graphed as a function of $k_j$ (or $f_j$ for a time series).
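
Once the hyperelliptic modes and the main spectrum are available, the discrete trace formula and the spectral amplitudes reduce to simple sums. The sketch below assumes hypothetical inputs: `mu[j][n]` holds the value of mode $j+1$ at base point $x_n$, and `E` lists the main-spectrum eigenvalues $E_1, \ldots, E_{2N+1}$ in order.

```python
def reconstruct_wave_train(mu, E, lam):
    """Discrete trace formula: eta(x_n) from the modes mu_j(x_n) and the E_i."""
    n_modes = len(mu)
    if len(E) != 2 * n_modes + 1:
        raise ValueError("need 2N+1 main-spectrum eigenvalues for N modes")
    eta = []
    for n in range(len(mu[0])):
        total = -E[0]                                   # the -E_1 term
        for j in range(n_modes):
            # 0-based j corresponds to E_{2j}, E_{2j+1} with 1-based mode index j+1
            total += 2.0 * mu[j][n] - E[2 * j + 1] - E[2 * j + 2]
        eta.append(total / lam)
    return eta

def ist_amplitudes(E, lam):
    """Open-band widths a_j = (E_{2j+1} - E_{2j}) / (2*lambda), j = 1..N."""
    n_modes = (len(E) - 1) // 2
    return [(E[2 * j + 2] - E[2 * j + 1]) / (2.0 * lam) for j in range(n_modes)]

# Toy single-mode example with made-up numbers (not data from this paper).
E_toy = [0.0, 1.0, 3.0]
mu_toy = [[1.0, 2.0, 3.0]]          # one mode moving across its open band [1, 3]
eta_toy = reconstruct_wave_train(mu_toy, E_toy, 2.0)
a_toy = ist_amplitudes(E_toy, 2.0)
```

Plotting the amplitudes against $k_j = j\Delta k$ (or $f_j$ for a time series) gives the spectrum in the same form as a linear Fourier amplitude spectrum.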


EXAMPLE  OF  NONLINEAR  FOURIER  ANALYSIS 

To illustrate the numerical inverse scattering transform in the analysis of nonlinear
wave trains, in Figure 1 I give the numerical construction of a three degree-of-freedom
wave train. In panel (a) are the hyperelliptic functions $\mu_j$, $j = 6, 9, 11$; in the present case
the $\mu_j$ are constructed from a rather arbitrary selection of the eigenvalues $E_{2j}$, $E_{2j+1}$. The
linear superposition of the three oscillation modes gives the solution to KdV as shown in
the upper part of panel (a). Note that the hyperelliptic oscillation modes are highly
non-sinusoidal in appearance due to nonlinear effects. In panel (b) are shown the amplitudes
of the linear Fourier modes (solid line) and of the three hyperelliptic modes (vertical
lines). Comparing these results one concludes that only three nonlinear oscillation modes
(three $\mu_j(x)$) are required to describe the motion, while instead the number of linear
Fourier modes is quite large (~ 50) for this example.

Figure 1. Synthesis of a wave train solution to the KdV equation. In (a) three hyperelliptic function
oscillation modes are linearly superposed to give the solution to KdV. In (b) are graphed the linear Fourier
transform of the wave train (solid line) and the three nonlinear Fourier amplitudes (the $a_j$, vertical lines).

ANALYSIS OF MEASURED ADRIATIC SEA WAVETRAINS

I extend results recently discussed by Osborne et al. [1991] and Osborne and Cavaleri
[1993] with regard to the analysis of nonlinear wave data obtained in a measurement
program in the Adriatic Sea about 10 km from Venice, Italy. The data were recorded in
16.5 m of water on the offshore research platform of the Italian National Research
Council (Consiglio Nazionale delle Ricerche) in a region where the bottom slope is rather
small, e.g., ~ 1/1000. A typical measured wave train, a 500 point time series with
temporal discretization $\Delta t = 1$ sec, is shown in Figure 2(a). A data set was selected for
which most of the wave energy was in the dominant direction of propagation; only 3% of
the wave energy was perpendicular to this direction. This ensured that the waves were
essentially unidirectional, a requirement of the KdV equation and consequently of the
inverse scattering transform analysis given herein. The significant wave height (average
of the highest one third waves) is $H_s = 2.9$ m and the dominant period is $T_d = 10.2$ sec.
The linear Fourier spectrum is shown in Figure 2(b); the results are quite typical of
measured ocean wave spectra, e.g., a central peak (around the dominant period) decays
rapidly at low frequency and has a power law spectrum at high frequencies.

It is worthwhile briefly indicating how one determines whether the KdV inverse
scattering transform is appropriate for analyzing a particular measured wave train. Clearly
if the physics of the wave motion is not that of the KdV equation, then the results of an
IST analysis are of dubious value. Three of the more important tests for ascertaining the
applicability of KdV for a particular data set are [Osborne and Cavaleri, 1993]: (1)
Determine whether the data lie in the KdV region of the Ursell number diagram
[Osborne, 1993]. (2) Determine if most of the wave energy lies to the left of
$f_{KdV} = 1.36\,c_0/\ldots$ in the frequency domain. (3) Determine if there is little directional
spreading in the wave field. For the data analyzed herein all three criteria are met rather
well. The results of the first test are discussed in detail in Osborne and Cavaleri [1993],
e.g. the (time-like) Ursell number, $Ur = 3gH_sT_d^2/4h^2 \sim 8$; hence, the Adriatic Sea waves
may be judged to be mildly nonlinear. The second test is verified in Figures 3 and 4.
Since only three percent of the wave energy is normal to the dominant wave direction, the
last criterion is also satisfied to good accuracy. The above constraints on the selection of
experimental data given herein may be considered to be rather conservative; efforts are
underway to extend the applicability of the present approach to less severely restricted
data sets [Osborne, 1993].

I now discuss the nonlinear Fourier analysis of the measured wave train; the Floquet
discriminant is shown in Figure 3(a); this constitutes a graph of the half-trace of the
monodromy matrix ($\Delta = (\alpha_{11} + \alpha_{22})/2$, the first of equations (10)) as a function of
frequency squared, $E = (\pi f)^2$. Note that the fluctuations in $\Delta(E)$ are quite large so that a
logarithmic scale has been used to graph the function outside the vertical range
($-1 < \Delta < 1$) (the graph is instead linear inside this interval). The spectrum is seen to
divide itself into two widely-separated regions of activity corresponding to solitons (on
the left) and radiation components (on the right). Since the soliton part of the Floquet
diagram is not easily visible (it is too dense in the domain $E \sim f^2$), this part of the
spectrum has been graphed separately in Figure 3(b). Here the large oscillations to the left
represent the soliton modes in the spectrum. The vertical dotted line is the so-called
reference level [Osborne and Bergamasco, 1986], which represents the level upon which
the solitons propagate in physical space.


[Figure panel axis label: Fourier Amplitudes (m)]

Figure 3. Floquet diagram of measured time series in Figure 2(a) is shown in (a). The Floquet diagram in
(a) has been expanded in the soliton (low frequency) part of the spectrum in (b) to reveal the presence of
the reference level and the solitons themselves. Vertical axis of both panels: Half-Trace of Monodromy Matrix.

The IST spectrum is given in Figure 4(a) where the spectral components are graphed
as a function of frequency, just as for the linear Fourier transform. The nonlinear Fourier
amplitudes, $a_j = (E_{2j+1} - E_{2j})/2\lambda$, are the amplitudes of the open bands in the Floquet
spectrum of Figure 3(b). The radiation spectrum is shown as a solid curve on the right,
while the solitons are displayed on the left as vertical arrows. About 7% of the wave
energy lies in the soliton part of the spectrum. It is useful to compare the amplitudes of
the nonlinear spectrum in Figure 4(a) with those of the linear spectrum in Figure 2(b).
Note that the radiation components in the scattering transform spectrum are smaller than
those for the linear Fourier spectrum. Physically this occurs because part of the energy
has been transferred from the radiation spectrum to the soliton spectrum, due to the
presence of nonlinear effects, by the inverse scattering transform.

The  nonlinear  spectral  index  is  shown  in  Figure  4(b).  This  parameter  indicates  just 
how  nonlinear  the  spectral  components  are  at  a  particular  frequency  [Osborne  and 
Bergamasco,  1986].  Since  the  index  indicates  strong  nonlinear  behavior  for  values  near 
1,  two  frequency  ranges  are  of  interest  in  this  analysis.  The  first  is  at  low  frequency, 
signaling  the  presence  of  solitons  in  the  spectrum.  The  second  is  near  the  peak  of  the 
radiative  part  of  the  wave  train.  Nonlinear  interactions  are  quite  strong  in  these  two 
regions.  It  is  of  interest  to  explore  these  particular  cases  using  nonlinear  filtering,  as 
discussed  below. 

The next step in the analysis is to compute the hyperelliptic functions (nonlinear
oscillation modes) of the data by base point iteration. The first 100 nonlinear modes are
given in Figure 5(a). The horizontal lines separate each mode from its neighbor on the
vertical scale, which has units of squared frequency (these are the units of the horizontal
coordinate of the Floquet diagram in Figure 3). While the scale of the nonlinear modes is
rather small in this figure, it is still easily seen that they are distinctly non-sinusoidal,
especially near the larger radiation modes. The solitons are not easily observable at the
scale of this figure, but these will be graphed below in such a way as to render them
visible. In order to illustrate IST and its associated linear superposition law, I now show
how the linear superposition of the nonlinear oscillation modes reconstructs the wave
train in Figure 5(b). I have summed nonlinear components only out to 0.2 Hz (the
Nyquist frequency is 0.5 Hz), but a comparison of Figure 5(b) with the measured wave
train, Figure 2(a), indicates that most of the spectral energy has been included. High
frequencies have been essentially filtered out (above 0.2 Hz) in the reconstruction of
Figure 5(b). This example constitutes my first application of the concept of nonlinear
filtering.
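
The band-limited reconstruction just described amounts to keeping only a subset of terms in the nonlinear Fourier series. The sketch below is a simplified illustration with hypothetical names and inputs: `mu[j][n]` is mode $j+1$ at point $n$, `E` lists $E_1, \ldots, E_{2N+1}$, and `freqs[j]` is the frequency assigned to mode $j+1$. Whether the constant $-E_1$ term is retained in a filtered sum is treated here as an option, which is an assumption of this sketch.

```python
def nonlinear_filter(mu, E, lam, freqs, f_lo, f_hi, keep_reference=True):
    """Sum only the modes with f_lo <= f_j <= f_hi in the nonlinear Fourier series."""
    n_modes = len(mu)
    if len(E) != 2 * n_modes + 1 or len(freqs) != n_modes:
        raise ValueError("inconsistent numbers of modes, eigenvalues or frequencies")
    keep = [j for j in range(n_modes) if f_lo <= freqs[j] <= f_hi]
    eta = []
    for n in range(len(mu[0])):
        total = -E[0] if keep_reference else 0.0
        for j in keep:
            total += 2.0 * mu[j][n] - E[2 * j + 1] - E[2 * j + 2]
        eta.append(total / lam)
    return eta

# Toy two-mode example with made-up numbers (not data from this paper).
mu_toy = [[1.0, 2.0, 3.0], [5.0, 6.0, 5.0]]
E_toy = [0.0, 1.0, 3.0, 5.0, 7.0]
f_toy = [0.05, 0.30]
low_pass = nonlinear_filter(mu_toy, E_toy, 2.0, f_toy, 0.0, 0.2)
full = nonlinear_filter(mu_toy, E_toy, 2.0, f_toy, 0.0, 0.5)
```

The same function with a narrow band around the spectral peak, or with a low-frequency band only, selects the radiation or soliton contributions respectively.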

I now consider two further applications of filtering using the nonlinear oscillation
modes. The first is with regard to the soliton part of the spectrum, the second is with
regard to the most nonlinear part of the radiation spectrum. In Figure 6 I show the
hyperelliptic modes in the soliton part of the spectrum; the vertical scale has been
expanded to allow easy visualization of the soliton $\mu$-functions (this corrects the situation

Figure 5. (a) The hyperelliptic oscillation modes for the measured wave train in Figure 2(a). The latter are
computed in the frequency range 0.0-0.2 Hz. The linear superposition of these modes gives the wave train
shown in (b), which results by low pass filtering the measured wave train. This is the first example of a
nonlinearly filtered wave train using the periodic inverse scattering transform.

soliton train is shown in Figure 6(b), where the original wave train is also superposed on
the figure. As noted previously [Osborne et al., 1991] the soliton contribution to the wave
train consists of a long, low-amplitude signal lying beneath the overlying, narrow-banded
wave train, which is dominated by the radiation modes. Again I find that the solitons tend
to lie beneath the maxima of the local wave groups; this topic is discussed in detail
elsewhere [Osborne and Cavaleri, 1993]. It is difficult to overstate how important the
nonlinear filtering process is to the understanding of the soliton dynamics; I know of no
other method for extracting them from an arbitrary oceanic wave train of the type studied
here.

The most nonlinear of the radiation modes have also been filtered from the measured
wave train. These results are shown in Figures 7 and 8. Figure 7(a) shows the
hyperelliptic modes centered near the peak of the spectrum, where the nonlinear spectral index
is nearly one, in the frequency range 0.094-0.108 Hz. Figure 7(b) gives the modes over a
somewhat larger frequency range extending from 0.094-0.122 Hz. These ranges are
indicated on the nonlinear spectral index graphed in Figure 4(b). Scrutiny of the nonlinear
modes in Figure 7 reveals that they are clearly not sinusoidal and that phase locking plays
an important role in their dynamics (details are discussed in [Osborne and Cavaleri,
1993]). It is important to note the main differences between the nonlinear filtering process
applied in the present paper and the usual one for linear Fourier analysis: (1) here I use
the spectral index to select the most nonlinear parts of the spectrum to study and (2) the
filtering process is fully nonlinear and often requires an iterative process [Osborne, 1993].
The regions that have a large spectral index are inverted to allow reconstruction and study
of the wave trains in the most nonlinear parts of the IST spectrum; linear superposition of
these modes gives the wave trains shown in Figure 8(a, b). These wave trains are highly
nonlinear and are not generally represented by the linear Fourier transform. For reference
I also show the soliton part of the wave train, superposed on the nonlinear radiation
modes in Figures 8(a, b). Figure 8(b) therefore represents the most nonlinear
contributions (as seen in the time domain) to the measured wave train in Figure 2(a).

[Figure residue: panel axes "Amplitude (m)" versus "Time (sec)", 0 to 500 sec.]

Figure 7. Hyperelliptic functions in the most nonlinear part of the radiation spectrum as indicated on
Figure 4(b) and in the text (a). In (b) are the modes for an expanded region of the radiation spectrum, again
defined in Figure 4(b).

Figure 8. (a) Sum of the nonlinear oscillation modes in Figure 7(a). (b) The sum of the nonlinear modes
shown in Figure 7(b). Both of these are examples of the application of nonlinear filtering by the inverse
scattering transform as developed in this paper.

SUMMARY AND CONCLUSIONS

It is worth pointing out that in the originally measured wave train (Figure 2(a)), the
soliton components are obscured by the radiation modes, e.g. solitons reside in the
nonlinear spectrum, but they are not directly visible due to the presence of the energetic
radiation components. The soliton dynamics are physically significant, but not directly
visible by an observer of the measured wave train. Nevertheless, using the numerical
methods described herein, we are able to locate the solitons and to explore their
dynamics. This constitutes an exercise in nonlinear filtering. Returning to the spectrum in
Figure 4(a) one can think of each component (as a function of frequency) as contributing
to the nonlinear Fourier series (28). By deleting the terms corresponding to the radiation
modes, and then summing the remaining terms for the soliton part of the spectrum, one
obtains only the contributions that the solitons make to the measured nonlinear wave
train. One finds a long, low-amplitude train, consisting of five nonlinearly interacting
solitons. We have therefore, using the numerical inverse scattering transform as a tool for
nonlinear filtering, found the solitons hidden in a sea of background radiation. An
important physical result is that the solitons tend to be phase locked beneath the maxima
of the wave packets. I am personally convinced that this fact provides an important clue
to the eventual understanding of the behavior of nonlinear wave dynamics in the Ursell
number regime under investigation. Theoretical understanding of these results is,
however, still lacking.

Another result of the application of nonlinear filtering to the analysis of the Adriatic
Sea data concerns the construction of the nonlinear, narrow-banded wave trains in
Figure 8. Since the nonlinear modes are clearly not sinusoidal (see Fig. 7), the
effect of nonlinear interactions amongst these closely separated components is evidently
rather important. These results are given here for the first time. A further surprising
result is that the nonlinear spectral index can be near 1 for the radiation spectrum as
well as for the solitons (Fig. 4(b)). The presence of large nonlinear interactions in the
radiation components is evidently another new result, yet to be fully explored. Complete
understanding of the influence of these nonlinear effects on the physics of narrow-band
wave motions, particularly with regard to phase locking, is a topic of future research.


ABSTRACT

The basic method of principal component analysis is relatively well understood by
physical oceanographers. Some less generally understood ideas involve significance
testing and rotation of the basis functions. Also, a number of other analysis techniques
related to principal component analysis can be more easily understood by using it as a
starting point. Such techniques include factor analysis, extended empirical orthogonal
function analysis, canonical correlation analysis, and complex empirical orthogonal
function analysis, for example.

The basic calculations comprising principal component analysis are presented, and
significance testing and rotation are discussed. Pacific sea level data are used to illustrate
these techniques. The paper concludes with a discussion of various extensions to the
basic technique and an evaluation of the usefulness of the extensions.


INTRODUCTION 

It is important to establish at the outset that this paper is not intended to be a
comprehensive review of principal component analysis (PCA) or its applications. It is
similarly not intended as a detailed review of the other techniques that will be discussed
as related to PCA, or as extensions of it. Rather, the intention of this paper is to briefly
review the PCA technique in order to establish a common frame of reference and to then
point out how several commonly used techniques can be viewed as applications of PCA
to a more general dataset. The reason for doing this is to place all these techniques in a
sensible framework, to point out where they overlap, and to give viewpoints, my own
and others, as to the relative merits of the various techniques. I have not avoided giving
the results of my own experience with the various techniques, but I have tried to clearly
identify my opinions in order that the reader can decide what should be ignored.

For the reader interested in a more comprehensive treatment of PCA, and of factor
analysis (FA), which is much more commonly used in some fields other than
oceanography and meteorology, several books are recommended. Preisendorfer (1988)
provides an extensive bibliography of applications and source material, and also gives
additional detail on nearly everything discussed in this paper. The book by Harman (1976)
on FA, while somewhat dated and primarily written from the point of view of workers in
psychology, is an excellent source for insight about rotation methods. A book written for
geologists (Jöreskog et al., 1976) is also well done and appears to be commonly used by
oceanographers. Finally, a very recent book by Reyment and Jöreskog (1993), which I
have not yet seen, is noteworthy because of the inclusion of an appendix that includes a
set of electronically available routines for the MATLAB programming environment.

The organization of the paper is as follows. The first section describes the basic
formulation of PCA. This section includes a comparison, due to Preisendorfer (1988), of
PCA with FA and linear regression analysis (LRA) that helps to illustrate why the
technique is so powerful and widely useful. This section continues with a discussion of
the problem of significance testing and the technique of rotation. Both of these latter
topics should be understood by any user of PCA. Throughout this section, examples are
given using a Pacific monthly mean sea level anomaly dataset. All of the discussion in
this section deals with the PCA of a scalar-valued dataset consisting of time series at a
set of stations. The following section treats the extension of PCA to the analysis of
vector-valued data and to the analysis of propagating signals. Finally, I will examine the
relationship of PCA to canonical correlation analysis (CCA), which is used for the
simultaneous analysis of more than one data field.


THE  BASICS  OF  PRINCIPAL  COMPONENT  ANALYSIS 
Basic  Computations 

We  will  consider  first  a  very  straightforward  application  of  PCA  to  a  set  of  time 
series  collected  at  a  set  of  stations.  As  an  example  of  this  I  will  use  sea  level  time  series 
collected  at  46  stations  in  the  Pacific  Ocean  (Figure  1).  The  temporal  means  are  removed 
from  the  time  series  at  each  station,  which  are  also  corrected  for  atmospheric  pressure 
and  the  mean  seasonal  cycle.  The  PCA  model  for  this  dataset  can  be  written 

$$h(s,t) = \sum_{k=1}^{K} a_k(t)\, e_k(s) \qquad (1)$$

where  s  is  the  station  index,  t  is  the  time,  and  K  is  equal  to  the  number  of  stations. 

Figure 1: Pacific sea level stations used in the PCA example. There are 46 stations, each with a
monthly mean time series spanning 1975 to 1990. The monthly mean values are corrected for
atmospheric pressure and the mean seasonal cycle before the PCA is performed. Data gaps are
interpolated.

The variance-covariance matrix for the dataset $h(s,t)$, which I will refer to simply as
the variance matrix, is a special case of what Preisendorfer (1988) calls the scatter matrix,
and is written

$$V(s,s') = \sum_{t} h(s,t)\, h(s',t) \qquad (2)$$

Since the variance matrix is real-valued and symmetric, it has real eigenvalues and
eigenvectors. The eigenvalues, which are generally sorted into decreasing order, give the
amount of variance in the original dataset that is accounted for by the associated
eigenvector and its time history function. The eigenvectors are mutually orthogonal and
form the basis set for the expansion shown in Eq. (1). The associated time history
functions, which are also mutually orthogonal, are computed from the original data and
the eigenvectors as

$$a_k(t) = \sum_{s=1}^{K} h(s,t)\, e_k(s) \qquad (3)$$

where I have assumed that the eigenvectors, $e_k(s)$, are normalized to unit variance, and
the time history functions, $a_k(t)$, are allowed to carry the variance.

Figure 2 shows the results of applying the PCA technique to the Pacific sea level
dataset. The eigenvectors associated with the two largest eigenvalues are interpreted as
space maps, and the analogous time history functions can be interpreted as modulating
the space maps and indicating when that particular space map's pattern is strongly
expressed in the original dataset. In this particular case the two functions are both
associated with the El Niño/Southern Oscillation (ENSO) events in the tropical Pacific.
The space maps show this clearly. The time history functions, however, are somewhat
nondescript, but this will be discussed more later.
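
The basic computations described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the code used for the sea level analysis: the data matrix is random noise standing in for the real dataset, its dimensions (192 months by 46 stations) are hypothetical, and the function name `pca` is chosen here for convenience.

```python
import numpy as np

def pca(h):
    """PCA of a data matrix h with shape (n_times, n_stations)."""
    h = h - h.mean(axis=0)             # remove the temporal mean at each station
    V = h.T @ h                        # variance (scatter) matrix, summed over time
    evals, evecs = np.linalg.eigh(V)   # real symmetric input; eigenvalues ascending
    order = np.argsort(evals)[::-1]    # re-sort into decreasing order
    evals, evecs = evals[order], evecs[:, order]
    a = h @ evecs                      # time history functions: project data on eigenvectors
    return evals, evecs, a

# Synthetic stand-in for the sea level data (hypothetical sizes, random values).
rng = np.random.default_rng(0)
h = rng.standard_normal((192, 46))
evals, evecs, a = pca(h)
frac = evals / evals.sum()             # fraction of variance carried by each function
```

With the eigenvectors normalized to unit length, the time histories carry the variance: `a.T @ a` is diagonal with the eigenvalues on the diagonal, and `a @ evecs.T` reproduces the mean-removed data.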

Relationship  to  Linear  Regression  Analysis  and  Factor  Analysis 

In order to better understand what the PCA expansion defined in Eq. (1) does, it is
instructive to compare it to a linear regression analysis (LRA) and a factor analysis (FA).
If we truncate the PCA and FA expansions (criteria for doing this are discussed in the
next section), then these various expansions can be written

$$h(s,t) = \sum_{k=1}^{M} a_k(t)\, e_k(s) + \delta(s,t) \qquad \text{PCA} \quad (4a)$$

$$h(s,t) = \sum_{k=1}^{M} \varphi_k(t)\, p_k(s) + \epsilon(s,t) \qquad \text{LRA} \quad (4b)$$

$$h(s,t) = \sum_{k=1}^{M} f_k(t)\, \lambda_k(s) + \nu(s,t) \qquad \text{FA} \quad (4c)$$

In these expansions, $e_k$, $p_k$, and $\lambda_k$ are thought of in the present context as the
basis functions; $a_k$, $\varphi_k$, and $f_k$ are the amplitude functions; and $\delta$, $\epsilon$, and $\nu$
are the residual noise terms.

These expansions look very similar, but the underlying assumptions are actually quite
different. For the LRA, the basis functions are specified a priori, and the time history
functions are fit, typically by a least squares criterion, in order that the defined basis
functions account for the maximum amount of variance. In the PCA, the data are allowed
to choose their own basis set under the criterion that each function must explain the
maximum variance, subject to the additional constraint that each succeeding function
be orthogonal to all the preceding ones. The truncated PCA expansion is therefore
maximally efficient in accounting for the variance of the original dataset with the fewest
basis functions, but there is a cost. Namely, there is no guarantee that the eigenfunctions
correspond to any physically meaningful modes of variability of the original data. LRA,
on the other hand, can be constructed using basis functions that are defined from a priori
knowledge of the dynamics of the system being studied.

Figure 2: Results of PCA of the Pacific sea level dataset. Function #1 accounts for 37.4% of the
variance and Function #2 for 22.8% of the variance. Space map contour units are 1 cm, and
negative contours are dashed. The space map values must be multiplied by the time function
plotted below each map in order to obtain values comparable to the observations.

The strengths and weaknesses of FA as compared to LRA are similar to those of
PCA. However, when compared to PCA, the FA case is more subtle. Basically, while
PCA is a completely objective technique that needs only the original dataset to proceed,
FA treats the number of factors used in the expansion and the residual noise term,
$\nu(s,t)$, as unknowns, and thus is an underdetermined system. Many suggestions exist for
ways to close the system (Harman, 1976), but the technique remains somewhat
subjective. It requires specification of a priori information that is usually not trivial to
provide. Preisendorfer (1988) discusses these issues at some length and claims that FA
is the "conceptually deeper" of the two techniques. I have never been able to convince
myself that the additional subjectivity associated with FA provides much advantage
over PCA.
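
The contrast between LRA and PCA drawn above can be illustrated numerically. The sketch below is an assumption-laden toy: the data are synthetic, the a priori basis (a constant and a linear trend across stations) is arbitrary, and the helper name `explained_fraction` is invented here. It fits amplitude functions by least squares to a fixed basis, as in the LRA expansion, and compares the variance captured with that of a truncated PCA using the same number of functions.

```python
import numpy as np

def explained_fraction(h, basis):
    """Fraction of the total variance of h (n_times, n_stations) captured by a
    least squares fit to the columns of basis (n_stations, M)."""
    # Solve basis @ phi_T ~= h.T for the amplitude functions at every time.
    phi_T, *_ = np.linalg.lstsq(basis, h.T, rcond=None)
    resid = h.T - basis @ phi_T
    return 1.0 - (resid**2).sum() / (h**2).sum()

rng = np.random.default_rng(1)
n_t, n_s, M = 200, 12, 2
x = np.linspace(0.0, 1.0, n_s)
# Hypothetical data: two smooth spatial patterns with random amplitudes, plus noise.
h = (np.outer(rng.standard_normal(n_t), np.sin(np.pi * x))
     + 0.5 * np.outer(rng.standard_normal(n_t), np.cos(2 * np.pi * x))
     + 0.3 * rng.standard_normal((n_t, n_s)))
h -= h.mean(axis=0)

# LRA: basis specified a priori (here a constant and a linear trend in x).
p = np.column_stack([np.ones(n_s), x])
# PCA: basis chosen by the data (leading M eigenvectors of the variance matrix).
evals, evecs = np.linalg.eigh(h.T @ h)
e = evecs[:, ::-1][:, :M]

frac_lra = explained_fraction(h, p)
frac_pca = explained_fraction(h, e)
```

For any choice of M a priori functions, `frac_pca` is at least as large as `frac_lra`, which is the sense in which the truncated PCA expansion is maximally efficient. Nothing in the calculation says whether the PCA functions are physically meaningful.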

Significance  testing 

The full PCA expansion defined by Eq. (1) has the exact same information content as
the original dataset. It is rare, however, that the original data are free of noise, and one
must therefore assume that many, if not most, of the PCA functions simply represent
noise. The question naturally arises, then, of how to select the functions that may
represent signal, in order that they can be further analyzed. Preisendorfer (1988) is
particularly good on this topic of selection rules for PCA, and the interested reader
should consult that text. Harman (1976) provides an interesting historical perspective on
the development of older selection rules that have largely been superseded.

The basic idea behind all of the selection rules is quite simple. The functions are
compared to those that would result from data drawn from a specific noise model, and
those that are not consistent with such noise data are deemed worthy of further study as
signals with possible geophysical significance. The selection rules discussed by
Preisendorfer (1988) fall into three broad categories: variance dominant, time history,
and space map rules. N.B., the use of the words "time" and "space" is simply convenient
and does not restrict the application of these techniques to time series data at stations,
such as that used in my Pacific sea level example.

The variance dominant rules are probably the most commonly used selection rules in
oceanography and meteorology. Recall that the eigenvalues of the scatter matrix are
equal to the amount of variance of the original dataset accounted for by the associated
eigenvector (space map) and amplitude function (time history). Figure 3 shows the 46
eigenvalues obtained for the Pacific sea level dataset after placing them in decreasing
order. Also shown on this figure are the eigenvalue curves obtained by the application of
three different variance dominant selection rules. The first, labeled Rule N, is based on
the assumption that the original time series are simply noise, which is uncorrelated from
station to station. To apply this rule, 100 such datasets are generated, and the eigenvalues
of each are computed and placed in descending order. Then the 95th percentile at each
position in the ordering is found and plotted. Eigenvalues from the actual dataset that fall
above this curve are deemed unlikely to result from a dataset consisting of only noise. In
the case of the Pacific sea level dataset, only the first two functions are thus selected.

Figure 3: The 46 eigenvalues from the PCA of the Pacific sea level dataset are shown as open
circles. Also shown are the results of applying three different selection rules (see text for details).
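
The Rule N procedure described above lends itself to a short Monte Carlo sketch. The version below assumes white noise and rescales each noise eigenvalue set to the total variance of the data; that normalization, the function names, and the synthetic test data are assumptions made for illustration and are not taken from the original analysis.

```python
import numpy as np

def pca_eigenvalues(h):
    """Eigenvalues of the scatter matrix of h (n_times, n_stations), decreasing."""
    h = h - h.mean(axis=0)
    return np.linalg.eigvalsh(h.T @ h)[::-1]

def rule_n(h, n_trials=100, percentile=95.0, seed=0):
    """Return the data eigenvalues, the noise threshold curve, and a boolean
    mask marking eigenvalues that lie above the threshold."""
    rng = np.random.default_rng(seed)
    n_t, n_s = h.shape
    data_evals = pca_eigenvalues(h)
    total = data_evals.sum()
    noise_evals = np.empty((n_trials, n_s))
    for i in range(n_trials):
        # White noise, uncorrelated from station to station.
        ev = pca_eigenvalues(rng.standard_normal((n_t, n_s)))
        # Assumed normalization: match the total variance of the data.
        noise_evals[i] = ev * total / ev.sum()
    threshold = np.percentile(noise_evals, percentile, axis=0)
    return data_evals, threshold, data_evals > threshold

# Hypothetical data: one pattern common to all stations, plus white noise.
rng = np.random.default_rng(1)
n_t, n_s = 200, 10
signal = 2.0 * rng.standard_normal(n_t)
h = np.outer(signal, np.ones(n_s)) + rng.standard_normal((n_t, n_s))
data_evals, threshold, selected = rule_n(h)
```

In this toy case only the first function, which carries the common pattern, rises above the 95th percentile curve.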


One complication worth considering in the application of Rule N to geophysical data
is the fact that most time series of such data are not well-modeled by a white noise
model, but have significant serial correlation due to oversampling. One method that I
have used to deal with this problem is to define a noise model that consists of "red"
noise. I characterize the model according to the approximate spectral slope obtained from
a Fourier analysis of the noise series. Figure 4 shows the result of applying Rule N to the
Pacific sea level dataset using several values for the spectral slope. Clearly, quite
different conclusions about the significance of the low order functions would be reached
depending on which model is chosen. One of these red noise models is appropriate for
the Pacific sea level dataset, and this was in fact the noise model used in generating the
Rule N curve in Figure 3.
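
One way to build such a red noise model, offered here only as a sketch, is to shape random Fourier coefficients so that the power spectrum falls off as a power of frequency. The function names, the particular slopes, and the dataset dimensions below are assumptions chosen for illustration; the text does not specify how the original noise series were generated.

```python
import numpy as np

def red_noise(n_t, n_s, slope, rng):
    """Noise series, shape (n_t, n_s), with power spectrum ~ f**(-slope).
    slope = 0 gives white noise; series are independent between stations."""
    freqs = np.fft.rfftfreq(n_t)
    shape = np.zeros_like(freqs)
    shape[1:] = freqs[1:] ** (-slope / 2.0)   # amplitude ~ sqrt(power); zero at f = 0
    spec = (rng.standard_normal((freqs.size, n_s))
            + 1j * rng.standard_normal((freqs.size, n_s))) * shape[:, None]
    x = np.fft.irfft(spec, n=n_t, axis=0)
    return x / x.std(axis=0)                  # unit variance at each station

def rule_n_threshold(n_t, n_s, slope, n_trials=100, percentile=95.0, seed=0):
    """Percentile curve of the ordered eigenvalues from red noise datasets."""
    rng = np.random.default_rng(seed)
    evals = np.empty((n_trials, n_s))
    for i in range(n_trials):
        x = red_noise(n_t, n_s, slope, rng)   # zero mean by construction
        evals[i] = np.linalg.eigvalsh(x.T @ x)[::-1]
    return np.percentile(evals, percentile, axis=0)

# Hypothetical dataset size; compare white noise with a steep red spectrum.
white = rule_n_threshold(192, 46, slope=0.0)
red = rule_n_threshold(192, 46, slope=2.0)
```

Because serially correlated series have fewer effective degrees of freedom, the leading noise eigenvalues are larger for the red model than for the white one, so fewer data eigenvalues pass the test.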

Figure 4: Rule N applied to Pacific sea level data using various models for the noise (panel
title: "Rule N results from several noise models").

Variance dominant criteria other than Rule N are discussed by Preisendorfer (1988),
but Rule N appears to be the most widely used selection rule of this class. Two other
selection rules of historical interest are shown on Figure 3. The scree test is a subjective
test that requires the analyst to select the point on the eigenvalue curve where the
curve begins to trend up more sharply at lower orders. This is an experientially based
test. The second test, labeled the "Guttman" test, is objective. This test computes the
average eigenvalue and defines any that are larger than this value to be of possible
interest. The rationale here is simply that the selected functions explain more than an
average amount of variance. These latter two tests tend to be less conservative than
Rule N, at least when the 95th percentile is used in that test. These less conservative
tests have some value in the rotation problem that is discussed in the next section.
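
The Guttman test as described above reduces to a single comparison. A minimal sketch follows; the eigenvalues are invented for illustration and the function name is an assumption.

```python
import numpy as np

def guttman_select(evals):
    """Indices of the eigenvalues that exceed the average eigenvalue."""
    evals = np.asarray(evals, dtype=float)
    return np.flatnonzero(evals > evals.mean())

# Made-up eigenvalues in decreasing order (their mean is 3.0).
evals = np.array([10.0, 6.0, 3.0, 2.0, 1.0, 1.0, 0.5, 0.5])
selected = guttman_select(evals)
```

Here the first two functions are retained, since only they explain more than the average amount of variance.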

Time history and space map rules are less commonly used than variance dominant
rules, but this is probably due to the simplicity of the latter rather than to any inherent
advantage in them. The time history rules work by examining the time history function
and testing it for low frequency variability. Several ways of doing this are described by
Preisendorfer (1988). The space map rules are similar, but work on the eigenfunctions
(space maps) and look for coherent "spatial" patterns. These rules should probably be
more widely used, particularly since they should be able to detect functions that map to a
geophysical signal that does not dominate the variance of the dataset but can be identified
by coherent temporal or spatial patterns. An example is seen in Barnett (1977).

Another area of concern for significance testing is that of placing confidence limits
on the time series and space maps themselves. For example, how large do signals in the
time series or space maps shown in Figure 2 have to be before there is reasonable
confidence that they represent real modes of variation in the actual data? Unfortunately,
the variance dominant rules used to select the functions cannot answer such questions.
To the best of my knowledge, this is an unsolved problem, although there has been some
progress in this area using bootstrap techniques (D. Chelton, pers. comm.).

Rotation 

If one thinks of the eigenvectors as a set of orthogonal basis vectors for expanding
the original dataset, then it is easy to imagine geometrically "rotating" this basis set.
Another way to view this rotation is to imagine replacing the original set of basis vectors
with a set that consists of linear combinations of the original set. In the rotated frame, the
basis vectors can still be orthogonal, but the amplitude functions will now be correlated.

But  why  would  one  want  to  rotate  a  perfectly  good  set  of  basis  functions  anyway?  If 
the  primary  purpose  of  the  PCA  is  to  compress  the  original  dataset  into  a  few  functions 
that  still  capture  a  large  portion  of  the  variability  of  the  data,  then  there  is  no  need  for 
rotation. The PCA frame is, by construction, the most efficient description of the
variance  possible.  If  one  desires  to  interpret  the  individual  functions  in  physical  terms, 
however,  then  this  efficiency  can  be  a  problem. 

Imagine that the original data consist of a number of distinct, but not necessarily
orthogonal and unrelated, modes of variation. In order to explain the most variance
possible, the PCA technique will return linear combinations of these modes, not the
modes themselves. The hope of the rotation technique is that by relaxing the requirement
that the basis functions are maximally efficient at explaining variance, it may be possible
to obtain modes that more closely resemble the natural modes of the dataset. In fact, it is
often claimed that non-rotated PCA frames should not be interpreted physically at all.

Harman (1976) is the best reference I have found for discussion of rotations, although
it is written in the context of FA. There are a large number of rotation techniques, which
can be separated into orthogonal and oblique rotations. Both sets of rotations relax the
requirement that the functions be maximally efficient at explaining variance, but
orthogonal rotations preserve the orthogonality between the basis vectors while oblique
rotations do not require even this. Orthogonal rotations in general, and varimax
specifically, are most common and are much simpler to perform, and also to interpret,
than oblique rotations. My experience has been that rotation should always be done
before attempting to interpret the functions, but that little is gained by going beyond
simple varimax rotation. A contrary opinion and example is given by Richman (1981).

Orthogonal rotations in general work by searching for a rotated frame that minimizes
the number of basis functions that any particular time series in the original dataset
projects to. To say this another way, the technique seeks a rotated frame where any
station's time series from the original dataset projects onto the new basis functions in
such a way that the projections are near 0 or 1. Preisendorfer (1988; pp. 271-274) gives
an excellent graphical explanation of why this criterion is appropriate; the purpose here is
simply to document how the basic algorithm operates. In doing the optimization, there is
a penalty function that increases when a projection is "far" from 0 or 1, and the exact
form of this penalty function distinguishes the numerous rotation schemes, of which
varimax is probably the most common. A review of many others is given by Harman (1976).
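
As an illustration of how an orthogonal rotation of this kind can be computed, the sketch below uses a commonly seen SVD-based iteration for the varimax criterion. It is not claimed to be the algorithm used for the results in this paper, the iteration is not guaranteed to converge for every input, and the test patterns, function names, and tolerances are assumptions. Two "simple" patterns with disjoint support are mixed by a 20 degree rotation, and the varimax rotation is asked to undo the mixing.

```python
import numpy as np

def varimax(E, max_iter=100, tol=1e-8):
    """Orthogonally rotate the columns of E (n_stations, M) toward the varimax
    criterion. Returns the rotated basis E @ R and the rotation matrix R."""
    n, M = E.shape
    R = np.eye(M)
    crit_old = 0.0
    for _ in range(max_iter):
        L = E @ R
        # Gradient of the varimax criterion with respect to the rotated basis,
        # projected back onto the original basis.
        G = E.T @ (L**3 - L @ np.diag((L**2).sum(axis=0)) / n)
        U, s, Vt = np.linalg.svd(G)
        R = U @ Vt                         # nearest orthogonal matrix to G
        crit = s.sum()
        if crit_old > 0 and crit < crit_old * (1.0 + tol):
            break
        crit_old = crit
    return E @ R, R

def simplicity(L):
    """Varimax criterion: summed variance of the squared entries of each column."""
    return (L**2).var(axis=0).sum()

# Two simple patterns on disjoint sets of stations (hypothetical, 12 stations).
E0 = np.zeros((12, 2))
E0[0:3, 0] = 1.0 / np.sqrt(3.0)
E0[3:6, 1] = 1.0 / np.sqrt(3.0)
theta = np.deg2rad(20.0)
mix = np.array([[np.cos(theta), -np.sin(theta)],
                [np.sin(theta),  np.cos(theta)]])
E = E0 @ mix                               # mixed basis, as unrotated PCA might return
rotated, R = varimax(E)
```

In this example the rotated columns return, to within the convergence tolerance, to the two disjoint patterns, and the rotation matrix `R` stays orthogonal, so the variance described by the pair of functions is unchanged.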

Figure 5 shows the result of applying a varimax rotation to the first 10 functions from
the PCA of the Pacific sea level dataset. The reason for using 10 functions is that results
for low order rotated functions are unstable if too few functions are rotated, but are
relatively insensitive to adding in a few extra functions that represent only noise. This
conclusion is stated by Harman (1976), and my experience bears it out. The scree and
Guttman tests shown in Figure 3 are often useful indicators of the maximum number of
possibly interesting functions, and I have found them useful as a guide to choosing the
number of functions to include in the rotation procedure.

The maps and time series shown in Figure 5 account for almost exactly the same
amount of variance (50%) from the total dataset as the first two unrotated functions.
These functions are much simpler to interpret, however. Examination of the maps and
time series shows that the first and second functions, respectively, map to a western
Pacific ENSO response that is primarily north and south of the equator. The associated
time series show that the northern pattern occurs in the late part of the calendar year,
while the southern pattern is associated with a timing that is several months later.
Particularly interesting is the fact that the various events in the records map on to these
two modes differently; only the 1982-83 event shows a strong expression of both types
of events. Thus, it seems that the ENSO events tend to be one of the two types: a
"northern" type that sees mass lost primarily from north of the equator in the western
Pacific late in the calendar year, or a "southern" type where the mass comes from south
of the equator during the early part of the calendar year. Interpreting these two signals as
simply the beginning and ending stages of the same event is not quite satisfying, since in
that view most events either do not have a beginning or do not have an end.


EXTENSIONS  TO  PRINCIPAL  COMPONENT  ANALYSIS 
Vector  data 

As developed in the preceding section, PCA works for scalar data via the variance, or
scatter, matrix. In fact, this restriction to scalar data is unnecessary, and vector data can
be treated as long as an appropriate scatter matrix can be defined. An appropriate scatter
matrix is one that properly represents the variability characteristics of the dataset being
studied, and one that has a full set of eigenvalues and eigenvectors. Note that there can be
eigenvalues with multiplicity greater than one, although this does not generally happen in
datasets containing noise.

Figure 5: As in Figure 2, but for the rotated functions.

An early application of PCA to a vector-valued field was made by Barnett (1977),
who analyzed monthly mean surface wind vectors over the Pacific Ocean. His analysis
was not truly a vector analysis, however, in the sense that he separately analyzed the
zonal and meridional wind components without taking the vector nature of the data into
account. Legler (1983) treated basically the same dataset in a truly vectorial way by
defining the wind vectors as a complex quantity and then defining the scatter matrix by
using complex conjugates to form the variance matrix.

Preisendorfer (1988) points out that the vector analysis can also be done by simply
defining each of the wind components at each station as a separate variable in a normal
(scalar) PCA. In this case, if there are M stations with N time points, then the scatter
matrix is 2M by 2M, and there are 2M eigenvectors, each of which has a real-valued
N-point time history function associated with it. He goes further to argue that this
analysis is not equivalent to the complex method, but contains some additional
information on the complex phase that is lost in forming the scatter matrix with complex
conjugation. In fact, I find that this technique is also simpler in practice than the complex
one, in that the same routines developed for scalar analysis apply to this problem as well.
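
The two ways of setting up a vector PCA described above differ only in how the scatter matrix is formed. The following sketch shows both constructions side by side for random stand-in wind components; the array names and sizes are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
N, M = 120, 5                          # hypothetical: N time points, M stations
u = rng.standard_normal((N, M))        # zonal component (random stand-in)
v = rng.standard_normal((N, M))        # meridional component (random stand-in)
u -= u.mean(axis=0)
v -= v.mean(axis=0)

# (a) Stacked real analysis: each component at each station is its own variable.
x = np.hstack([u, v])                  # shape (N, 2M)
S_real = x.T @ x                       # 2M-by-2M real symmetric scatter matrix
evals_r, evecs_r = np.linalg.eigh(S_real)
a_real = x @ evecs_r                   # 2M real-valued, N-point time histories

# (b) Complex analysis: w = u + i v, scatter matrix formed with complex conjugation.
w = u + 1j * v                         # shape (N, M)
S_cplx = w.conj().T @ w                # M-by-M Hermitian scatter matrix
evals_c, evecs_c = np.linalg.eigh(S_cplx)
```

Both decompositions account for the same total variance (the traces of the two scatter matrices are equal), but the stacked form yields 2M real functions while the complex form yields M complex ones.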

Propagating  signals 

One  problem  with  the  basic  PCA  of  space-time  data  is  that  the  resulting  functions  do 
not properly represent propagating signals. This is because the dataset is described by a
set  of  space  maps  modulated  by  separate  time  series.  This  is  a  serious  drawback  in 
oceanography  and  meteorology  where  propagating  signals  are  usually  present,  and  are 
often  the  features  of  primary  interest  to  the  analyst.  A  number  of  techniques  have  been 
developed  that  extend  the  basic  PCA  to  the  case  of  signals  that  cannot  be  represented  by 
separable  functions  of  space  and  time. 

An early development was the method usually referred to as frequency domain
empirical orthogonal functions (FDEOFs). The basic references for this technique are
Wallace and Dickinson (1972) and Wallace (1972). Basically, this procedure starts by
transforming the time series at each point in space into the frequency domain. The
resulting complex spectra are averaged over a frequency band of interest, and the
resulting space map of complex numbers is analyzed via a complex form of PCA. The
result of this analysis is a map of amplitude and phase that can be analyzed for phase
propagation signatures. This method has not been widely used and in my experience is
not overly successful at identifying signals that are not readily apparent in the original
data.

An improved technique, referred to as complex empirical orthogonal functions (CEOFs),
was described by Barnett (1983). The results of this analysis are somewhat similar to
the output of FDEOFs, but the calculations are simplified by the use of a Hilbert
transform on the original time series, which builds in the phase information necessary
to identify propagating signals. The CEOF technique is more general than the FDEOFs,
and should be superior to that method at identifying features that are distinctly
non-sinusoidal in nature. Although CEOFs are computationally simpler than FDEOFs, I have
found in my own work that they are similarly difficult to interpret in most applications.
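
As an illustration of the Hilbert-transform idea, the following is a minimal sketch, not Barnett's (1983) implementation. It assumes a gap-free data matrix with time along the first axis; the function names and the synthetic travelling wave are hypothetical. The time series are turned into complex (analytic) signals with an FFT, and an ordinary singular value decomposition of the complex matrix then yields patterns with amplitude and phase.

```python
import numpy as np

def analytic_signal(series):
    """Return series + i * Hilbert(series) along the time axis (axis 0), via the FFT."""
    n_time = series.shape[0]
    spectrum = np.fft.fft(series, axis=0)
    weights = np.zeros(n_time)
    weights[0] = 1.0                          # keep the mean term unchanged
    if n_time % 2 == 0:
        weights[1:n_time // 2] = 2.0          # double the positive frequencies
        weights[n_time // 2] = 1.0            # keep the Nyquist term unchanged
    else:
        weights[1:(n_time + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * weights[:, None], axis=0)

def complex_eofs(data):
    """data: real array (n_time, n_space). Returns complex patterns (one per row),
    complex time coefficients, and the fraction of variance of each mode."""
    anomalies = data - data.mean(axis=0)
    complex_data = analytic_signal(anomalies)
    u, s, vh = np.linalg.svd(complex_data, full_matrices=False)
    patterns = vh.conj()                      # eigenvectors of the Hermitian covariance matrix
    coefficients = u * s
    variance_fraction = s**2 / np.sum(s**2)
    return patterns, coefficients, variance_fraction

# Hypothetical test signal: one wave travelling along a line of 16 points, plus weak noise.
t = np.arange(200)[:, None]
x = np.arange(16)[None, :]
rng = np.random.default_rng(0)
wave = np.cos(2 * np.pi * (x / 16 - t / 20)) + 0.05 * rng.standard_normal((200, 16))
patterns, coefficients, variance_fraction = complex_eofs(wave)
phase_step = np.diff(np.unwrap(np.angle(patterns[0])))   # phase change between neighbours
```

In this synthetic case a single complex mode carries the travelling wave, and the phase of the leading pattern changes by a constant amount from point to point, which is the signature of propagation.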

Probably the most widely used technique for describing propagating signals is the
extended empirical orthogonal function (EEOF) method (Barnett and Hasselmann, 1979;
Weare and Nasstrom, 1982). This technique builds in the phase information by "extending"
the analysis to include not only the original dataset, but also the same dataset at a
variety of temporal lags. These lagged time series are simply input as additional
variables, and the normal machinery for basic PCA therefore applies. With the extended
dataset it is possible to identify patterns at one time that have high correlations with
patterns at a later time. A good example of the application of EEOFs is given by White
and Tai (1992). One advantage of this method is that the signals can deform in space and
time in fairly general ways without being lost to the technique. I have found this
technique to be very useful in a number of different contexts.
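
The construction of the extended dataset can be sketched as follows. This is an illustrative sketch only, assuming a gap-free data matrix with time along the first axis; the function name, the choice of lags and the synthetic wave are hypothetical. Lagged copies of the data are placed side by side as extra variables and a standard PCA (here via the SVD) is applied.

```python
import numpy as np

def extended_eofs(data, lags):
    """data: array (n_time, n_space); lags: non-negative integer lags.
    Returns patterns with shape (n_modes, n_lags, n_space), the time
    coefficients, and the fraction of variance of each mode."""
    lags = sorted(lags)
    n_time, n_space = data.shape
    n_rows = n_time - lags[-1]
    # Each lag contributes n_space additional "variables" (columns).
    extended = np.hstack([data[lag:lag + n_rows] for lag in lags])
    extended = extended - extended.mean(axis=0)
    # Ordinary PCA of the extended data matrix.
    u, s, vt = np.linalg.svd(extended, full_matrices=False)
    patterns = vt.reshape(len(s), len(lags), n_space)
    return patterns, u * s, s**2 / np.sum(s**2)

# Hypothetical test signal: a wave travelling along 16 points, plus weak noise.
t = np.arange(200)[:, None]
x = np.arange(16)[None, :]
rng = np.random.default_rng(0)
wave = np.cos(2 * np.pi * (x / 16 - t / 20)) + 0.05 * rng.standard_normal((200, 16))
patterns, coefficients, variance_fraction = extended_eofs(wave, lags=[0, 2, 4])
```

For a pure travelling wave the leading two modes form a pair with nearly equal variance, and each mode contains one spatial pattern per lag, so the displacement of the pattern from lag to lag can be read off directly.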

Another technique that can identify propagating disturbances is the principal
oscillation pattern analysis. I will not discuss this technique but will refer the
interested reader to the paper by von Storch in this same volume.

Canonical  Correlation  Analysis 

In all of the preceding discussion I have dealt only with datasets consisting of one
data type, e.g., sea level or wind vectors. In fact, if the data are appropriately
non-dimensionalized, then there is nothing to prevent data with different units from
being included in the PCA. This procedure is often useful, but only identifies the major
modes of variability of the datasets. It does not identify the patterns of variability
in the different datasets that are related, or co-varying. There is, however, a
technique related to PCA that looks for these types of relationships; this technique is
called canonical correlation analysis (CCA), and in the past few years it has become
more widely used in oceanography.

Preisendorfer (1988) shows how the PCA description of two different datasets can be
used to derive CCA, although the original derivation of CCA, which he attributes to
Hotelling (1936), did not actually make use of this machinery. To drastically
oversimplify, the time history functions from the PCA for each of the datasets can be
used to form a correlation matrix. This matrix is then used to form an eigenvalue
problem that leads to the canonical correlation functions. The first of these functions
can be interpreted as the pattern in one dataset that is maximally correlated with the
corresponding pattern in the other dataset. Then the second function reveals the
patterns that give the highest correlation between the datasets after removing the
correlation due to the first canonical correlate, and so on. As with PCA, there are
selection rules to be applied to determine whether the CCA results could arise from data
consisting simply of noise.

Aside from the fact that CCA can be derived via PCA, it is considered a related
technique because it can be viewed as a natural extension of PCA. PCA describes the
variance structure of a dataset, or datasets, while CCA describes the covariance between
two datasets. Some useful examples of oceanographic and meteorological applications of
this technique are given by Barnett and Preisendorfer (1987) and Graham et al. (1987).
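
The route from PCA to CCA outlined above can be sketched numerically. This is a simplified sketch under stated assumptions, not Preisendorfer's (1988) full procedure: both datasets are gap-free with time along the first axis, the number of retained PCA modes is chosen by hand, and the function name and the synthetic data are hypothetical. The normalized PCA time histories of the two datasets are correlated with each other, and a singular value decomposition of that correlation matrix gives the canonical correlations.

```python
import numpy as np

def cca_via_pca(x, y, n_modes_x, n_modes_y):
    """x: (n_time, n_x), y: (n_time, n_y). Returns the canonical correlations
    and the canonical time series of each dataset (one pair per column)."""
    x = x - x.mean(axis=0)
    y = y - y.mean(axis=0)
    # Columns of U are the PCA time histories, scaled to unit length.
    ux, _, _ = np.linalg.svd(x, full_matrices=False)
    uy, _, _ = np.linalg.svd(y, full_matrices=False)
    tx = ux[:, :n_modes_x]
    ty = uy[:, :n_modes_y]
    # Correlation matrix between the two sets of time histories.
    cross_correlation = tx.T @ ty
    a, canonical_correlations, bt = np.linalg.svd(cross_correlation)
    n_pairs = len(canonical_correlations)
    u = tx @ a[:, :n_pairs]
    v = ty @ bt.T[:, :n_pairs]
    return canonical_correlations, u, v

# Hypothetical data: two fields that share one common time signal.
rng = np.random.default_rng(0)
n_time = 300
shared = rng.standard_normal(n_time)
x = np.outer(shared, [1.0, 0.5, -1.0, 0.2, 0.8]) + 0.3 * rng.standard_normal((n_time, 5))
y = np.outer(shared, [1.0, -1.0, 0.5, 0.3]) + 0.3 * rng.standard_normal((n_time, 4))
canonical_correlations, u, v = cca_via_pca(x, y, n_modes_x=3, n_modes_y=3)
```

The spatial patterns belonging to each canonical pair, and the selection rules mentioned above, are omitted from this sketch.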

CONCLUSIONS 

The basic calculations involved in PCA, which are probably familiar to most
oceanographers, were reviewed. Methods for testing the significance of the PCA functions
were discussed, and it was suggested that, in addition to the common use of variance
dominant selection rules (e.g., Rule N), more use should probably be made of time
history and space map selection rules. Also, the technique of factor rotation was
discussed briefly, with the conclusion that rotation should be an important part of any
attempt to physically interpret the results of a PCA. Orthogonal rotations, such as
varimax, are likely sufficient for most applications.

Extensions of the basic PCA technique were discussed that allow the analysis of
vector-valued datasets, as well as datasets containing signals that propagate in
space-time. My experience is that the most general procedure for dealing with vector
data, described by Preisendorfer (1988), is also the simplest to apply. Similarly, for
the analysis of propagating signals, the EEOF method is also the simplest to use and
manages to perform at least as well as the more complicated frequency domain and complex
techniques. Finally, it was pointed out that CCA, which identifies patterns of covariance
between different datasets, is a natural extension of PCA that is gradually finding more
widespread use in oceanography.

ABSTRACT

In  the  present  paper  the  concept  of  the  principal  oscillation  pattern  (POP)  analysis  is 
reviewed.  This  technique  is  used  to  simultaneously  infer  the  characteristic  patterns  and 
time  scales  of  a  vector  time  series.  The  POPs  may  be  seen  as  the  normal  modes  of  a 
linearized  system  whose  system  matrix  is  estimated  from  data.  As  a  demonstration,  the 
POP  technique  is  used  for  the  analysis  of  the  intraseasonal  variability  in  the  equatorial 
Pacific  Ocean;  first  results  are  presented.  Daily  observations  of  temperature  and  currents 
in the upper 500 m of the equatorial Pacific, recorded by moored buoys, are analyzed with
respect  to  intraseasonal  (40-180  day  band)  variations.  Two  oscillatory  highly  coherent 
modes  are  found  with  periods  between  65  and  120  days.  Both  modes  propagate  eastward 
along  the  equator.  The  modes  are  clearly  reflected  in  both  the  zonal  currents  and  the 
temperatures,  which  trail  behind  the  zonal  currents  by  45°.  In  the  slower  of  the  two 
modes,  the  temperature  signal  propagates  more  slowly  than  the  zonal  current  signal,  and 
no  signal  occurs  in  the  meridional  current.  The  mode’s  activity  is  enhanced  during  warm 
events  of  the  Southern  Oscillation.  In  the  faster  mode  a  signal  also  appears  in  the 
meridional  current.  Its  amplitude  exhibits  an  annual  cycle,  with  variance  on  the  annual  and 
on  the  semiannual  period.  The  slower  mode  might  be  an  equatorial  Kelvin  wave  but  the 
faster  mode,  which  has  a  significant  meridional  current  component,  is  inconsistent  with  the 
concept  of  an  equatorial  Kelvin  wave. 

1.  INTRODUCTION 

Principal  oscillation  pattern  analysis.  In  the  present  paper  the  principal  oscillation 
pattern  (POP)  technique  is  reviewed  (Section  2)  and  its  usefulness  is  demonstrated  by  an 
analysis  of  the  intraseasonal  variability  in  the  equatorial  Pacific  (Section  3).  The  POP 
analysis  is  a  multivariate  technique  to  empirically  infer  the  characteristics  of  the  space-time 
variations  of  a  complex  system  in  a  high-dimensional  space  (Hasselmann,  1988;  von 
Storch  et  al.,  1988).  The  basic  ansatz  is  to  identify  a  low-order  system  with  a  few  free 
parameters  fitted  to  the  data.  Then,  the  space-time  characteristics  of  the  low-order  system 
are regarded as being the same as those of the full system.

Applications of POP analysis. The POP analysis is now a routinely used tool¹ to
diagnose the space-time variability of the climate system. Processes analysed with POPs
include:

•  The  low-frequency  variability  of  the  thermohaline  circulation  in  the  global  ocean 
(Mikolajewicz  and  Maier-Reimer,  1991;  Weisse  et  al.,  in  press), 

•  The  low-frequency  variability  in  the  coupled  atmosphere-ocean  system  (Xu,  1993), 

•  The El Niño / Southern Oscillation (ENSO) (Xu and von Storch, 1990; Xu, 1990;
Blumenthal, 1991; Latif and Villwock, 1989; Latif and Flügel, 1990; Bürger, 1993; Xu,
1992; Latif et al., 1993),

•  The Madden and Julian Oscillation (MJO), also named the tropical 30- to 60-day
oscillation (von Storch et al., 1988; von Storch and Xu, 1990; von Storch and
Baumhefner, 1991; and von Storch and Smallegange, 1991),

•  The stratospheric Quasi-Biennial Oscillation (Xu, 1992),

•  Tropospheric  baroclinic  waves  (Schnur  et  al.,  1993). 


Generalizations of the POP analysis. There is a series of generalizations of the basic
POP approach which we will not detail in the present paper. The predictive potential of
the POP method has been tested with the Southern Oscillation (Xu and von Storch, 1990)
and with the Madden and Julian Oscillation (von Storch and Xu, 1991). In the
cyclostationary POP analysis, the estimated system matrix is allowed to vary
deterministically with an externally forced cycle (Blumenthal, 1991). In the complex POP
analysis not only the state of the system but also its “momentum” is modeled (Bürger,
1993).

Organization.  In  Section  2,  the  POPs  are  introduced  in  two  conceptually  different  ways. 
One  way  is  to  define  POPs  as  normal  modes  of  a  linear  system  in  which  parameters  are 
inferred from a vector time series. The other way is to regard POPs as a simplified version
of  principal  interaction  patterns  (PIPs).  The  PIP  ansatz  (Hasselmann,  1988)  is  a  fairly 
general  approach  which  allows  for  a  large  variety  of  complex  scenarios.  In  Section  3  a 
POP  analysis  of  daily  hydrographic  reports  (temperature,  zonal  and  meridional  currents,  as 
well  as  surface  wind)  from  moored  buoys  in  the  tropical  Pacific  Ocean  is  presented.  Two 
eastward  propagating  modes,  both  similar  to  the  mode  described  by  Johnson  and 
McPhaden  (1993),  are  identified  and  their  spatial  signatures  are  described.  The  paper  is 
concluded  in  Section  4  with  some  remarks  on  the  general  merits  and  limitations  of  the 
POP  technique. 


1 A FORTRAN code for the regular POP analysis, with a manual (Gallagher et al., 1991), is
available free of charge from the Deutsches Klimarechenzentrum, Bundesstrasse 55, 2000
Hamburg 13, Germany.

2 PRINCIPAL OSCILLATION PATTERNS

The following notations are used: Vectors are given as bold letters and matrices as
calligraphic letters like $\mathcal{A}$ or $\mathcal{X}$. If $\mathcal{A}$ is a matrix
then $\mathcal{A}^T$ is the transposed matrix. If $x$ is any complex quantity then $x^*$
is its complex conjugate. It should be noted that the POP formalism (conventional,
cyclostationary, and complex POP analysis) may be applied to linear systems whose system
matrices are estimated from data or whose system matrices are derived from theoretical
dynamical considerations (Schnur et al., 1993).

2.1  POPs  and  Normal  Modes 

Normal modes. The normal modes of a linear discretized real system

$$\mathbf{x}(t+1) = \mathcal{A}\,\mathbf{x}(t) \tag{1}$$

are the eigenvectors $\mathbf{p}$ of the matrix $\mathcal{A}$. In general, $\mathcal{A}$
is not symmetric and some or all of its eigenvalues $\lambda$ and eigenvectors
$\mathbf{p}$ are complex. However, since $\mathcal{A}$ is a real matrix, the complex
conjugate quantities $\lambda^*$ and $\mathbf{p}^*$ also satisfy the eigen-equation
$\mathcal{A}\,\mathbf{p}^* = \lambda^*\mathbf{p}^*$. In most cases, all eigenvalues are
different and the eigenvectors form a linear basis. So each state $\mathbf{x}$ may be
uniquely expressed in terms of the eigenvectors

$$\mathbf{x} = \sum_j z_j\,\mathbf{p}_j \tag{2}$$

The coefficients of the pairs of complex conjugate eigenvectors are complex conjugate,
too. Inserting (2) into (1) we find that the coupled system (1) becomes uncoupled,
yielding $n$ single equations, where $n$ is the dimension of the process $\mathbf{x}$,

$$z(t+1)\,\mathbf{p} = \lambda\,z(t)\,\mathbf{p} \tag{3}$$

so that if $z(0) = 1$

$$z(t)\,\mathbf{p} = \lambda^t\,\mathbf{p}. \tag{4}$$

The contribution $\mathbf{P}(t)$ of the complex conjugate pair $\mathbf{p}$,
$\mathbf{p}^*$ to the process $\mathbf{x}(t)$ is given by

$$\mathbf{P}(t) = z(t)\,\mathbf{p} + [z(t)\,\mathbf{p}]^*. \tag{5}$$

Writing $\mathbf{p} = \mathbf{p}^1 + i\,\mathbf{p}^2$ and $2z(t) = z^1(t) - i\,z^2(t)$,
this reads

$$\mathbf{P}(t) = z^1(t)\,\mathbf{p}^1 + z^2(t)\,\mathbf{p}^2
  = \rho^t\left(\cos(\eta t)\,\mathbf{p}^1 - \sin(\eta t)\,\mathbf{p}^2\right) \tag{6}$$

with $\lambda = \rho\exp(-i\eta)$ and if $z(0) = 1$. The geometric and physical meaning
of (6) is that between the spatial patterns $\mathbf{p}^1$ and $\mathbf{p}^2$ the
trajectory $\mathbf{P}(t)$ performs a spiral (Figure 1) with period $T = 2\pi/\eta$ and
e-folding time $\tau = -1/\ln(\rho)$, in the consecutive order

$$\cdots \rightarrow \mathbf{p}^1 \rightarrow -\mathbf{p}^2 \rightarrow -\mathbf{p}^1
  \rightarrow \mathbf{p}^2 \rightarrow \mathbf{p}^1 \rightarrow \cdots \tag{7}$$


Figure 1. Typical evolution of a POP signal, given by Eq. (6), if $z^1(0) = 0$ and
$z^2(0) = 1$. In this demonstration the period is $T \approx 9$ and the e-folding time is
$\tau \approx 2.8$.
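
The period and e-folding time of a mode follow directly from its eigenvalue. The following is a minimal numerical sketch with hypothetical function names. It assumes a 2 x 2 system matrix built as a damped rotation so that the period is 9 and the e-folding time is 2.8, as in Figure 1, and it evaluates the contribution of the conjugate pair as $\mathbf{P}(t) = 2\,\mathrm{Re}(\lambda^t \mathbf{p})$, which is Eq. (5) with $z(0) = 1$.

```python
import numpy as np

def period_and_efolding_time(eigenvalue):
    """Period T = 2*pi/eta and e-folding time tau = -1/ln(rho) of one mode."""
    rho = np.abs(eigenvalue)
    eta = np.abs(np.angle(eigenvalue))
    period = 2 * np.pi / eta if eta > 0 else np.inf
    efolding_time = -1.0 / np.log(rho)
    return period, efolding_time

def pair_contribution(eigenvalue, pattern, n_steps):
    """P(t) = z(t) p + [z(t) p]* for t = 0, ..., n_steps - 1, with z(0) = 1."""
    steps = np.arange(n_steps)[:, None]
    return 2.0 * np.real(eigenvalue**steps * pattern[None, :])

# Hypothetical system matrix: a rotation by eta per time step, damped by rho.
eta = 2 * np.pi / 9
rho = np.exp(-1 / 2.8)
system_matrix = rho * np.array([[np.cos(eta), -np.sin(eta)],
                                [np.sin(eta), np.cos(eta)]])

eigenvalues, eigenvectors = np.linalg.eig(system_matrix)
period, efolding_time = period_and_efolding_time(eigenvalues[0])
trajectory = pair_contribution(eigenvalues[0], eigenvectors[:, 0], n_steps=20)
```

Each row of `trajectory` is the state one time step after the previous row, so plotting the two columns against each other shows the damped spiral.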

The  e-folding  time.  The  e-folding  time  has  to  be  considered  with  some  caution.  It 
represents  formally  the  average  time  for  an  amplitude  of  strength  one  to  reduce  to  1/e.  But 
in  the  POP  context  this  time  is  a  statistic  of  the  entire  time  interval,  i.e.,  it  is  derived  not 
only  from  the  episodes  when  the  signal  is  active  but  also  from  those  times  when  the  signal 
is weak or even absent. As a result, when the mode is active it is damped less quickly
than the e-folding time indicates. The other limitation refers to the presence or
absence  of  high-frequency  variations.  If  these  are  filtered  out,  as  in  Section  3,  the  e-folding 
time  is  lengthened. 

Representation of normal modes. The modes may be represented either by the two
patterns $\mathbf{p}^1$ and $\mathbf{p}^2$, or by plots of the local wave amplitude
$A^2(\mathbf{r}) = [p^1(\mathbf{r})]^2 + [p^2(\mathbf{r})]^2$ and relative phase
$\psi(\mathbf{r}) = \tan^{-1}[p^2(\mathbf{r})/p^1(\mathbf{r})]$ (Figure 2). The
transformation (7) between the patterns $\mathbf{p}^1$ and $\mathbf{p}^2$ can assume
various geometric wave forms. If $p^2(\mathbf{r}) = p^1(\mathbf{r} - \mathbf{r}_0)$ with
a location vector $\mathbf{r}$ and a fixed vector $\mathbf{r}_0$, the signal appears as a
parallel crested wave of wavelength $4r_0$, propagating in the $\mathbf{r}_0$-direction
(Figure 2a). In Figure 2b an amphidromal (rotational) wave is shown.


Figure 2. Examples of (a) a propagating wave and (b) an amphidromal wave and their
representation in terms of POPs. Top two panels: representation by $\mathbf{p}^1$ and
$\mathbf{p}^2$. Bottom panel: representation by phase $\psi$ (dashed) and amplitude $A$
(solid). From von Storch et al. (1988).

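Time coefficients. The pattern coefficients $z_j$ are given as the dot product of
$\mathbf{x}$ with the adjoint patterns $\mathbf{p}_j^A$, which are the normalized
eigenvectors of $\mathcal{A}^T$:

$$z_j = \mathbf{x}^T\mathbf{p}_j^A = \sum_k x_k\,p_{j,k}^A \tag{8}$$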

POPs. All information used so far is the existence of a linear equation, Eq. (1), with
some matrix $\mathcal{A}$. No assumption was made about the origin of this matrix. In
dynamical theory, the origins of Eq. (1) are linearized and discretized differential
equations. In the case of the POP analysis, the relationship

$$\mathbf{x}(t+1) = \mathcal{A}\,\mathbf{x}(t) + \text{noise} \tag{9}$$

is hypothesized. Multiplication of Eq. (9) from the right-hand side by the transposed
$\mathbf{x}^T(t)$ and taking expectations, $E$, leads to

$$\mathcal{A} = E\left[\mathbf{x}(t+1)\,\mathbf{x}^T(t)\right]
  \left[E\left[\mathbf{x}(t)\,\mathbf{x}^T(t)\right]\right]^{-1}. \tag{10}$$

The eigenvectors of the matrix in Eq. (10), or the normal modes of Eq. (9), are called
principal oscillation patterns. The coefficients $z$ are called POP coefficients. Their
time evolution is given by Eq. (3), superimposed by noise

$$z(t+1) = \lambda\,z(t) + \text{noise}. \tag{11}$$

The stationarity of Eq. (11) requires $\rho < 1$. In practical situations, when only a
finite time series $\mathbf{x}(t)$ is available, $\mathcal{A}$ is estimated by first
deriving the sample lag-1 covariance matrix
$\mathcal{X}_1 = \sum_t \mathbf{x}(t+1)\,\mathbf{x}^T(t)$ and the sample covariance
matrix $\mathcal{X}_0 = \sum_t \mathbf{x}(t)\,\mathbf{x}^T(t)$ and then forming
$\mathcal{A} = \mathcal{X}_1\mathcal{X}_0^{-1}$. The eigenvalues of this matrix always
satisfy $\rho < 1$.
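
The estimation step can be sketched in a few lines. This is an illustrative sketch, not the FORTRAN code mentioned in the Introduction; the function name is hypothetical, the time mean is removed first (an assumption), and the test series is generated from a known damped rotation with period 20 and damping 0.9 so that the estimate can be checked.

```python
import numpy as np

def estimate_pops(x):
    """x: array (n_time, n_dim). Returns the estimated system matrix,
    its eigenvalues, and the POPs (one per column)."""
    anomalies = x - x.mean(axis=0)
    lag1 = anomalies[1:].T @ anomalies[:-1]      # sum over t of x(t+1) x(t)^T
    lag0 = anomalies.T @ anomalies               # sum over t of x(t) x(t)^T
    # X1 X0^{-1}, written as a linear solve (lag0 is symmetric).
    system_matrix = np.linalg.solve(lag0, lag1.T).T
    eigenvalues, pops = np.linalg.eig(system_matrix)
    return system_matrix, eigenvalues, pops

# Hypothetical test series: x(t+1) = A x(t) + noise with a known A.
rho, eta = 0.9, 2 * np.pi / 20
true_matrix = rho * np.array([[np.cos(eta), -np.sin(eta)],
                              [np.sin(eta), np.cos(eta)]])
rng = np.random.default_rng(0)
n_time = 5000
x = np.zeros((n_time, 2))
for step in range(1, n_time):
    x[step] = true_matrix @ x[step - 1] + rng.standard_normal(2)

system_matrix, eigenvalues, pops = estimate_pops(x)
estimated_period = 2 * np.pi / np.abs(np.angle(eigenvalues[0]))
estimated_efolding_time = -1.0 / np.log(np.abs(eigenvalues[0]))
```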

To  reduce  the  number  of  spatial  degrees  of  freedom  in  some  applications,  the  data  are 
subjected  to  a  truncated  empirical  orthogonal  function  (EOF)  expansion,  and  the  POP 
analysis  is  applied  to  the  vector  of  the  first  EOF  coefficients.  A  positive  by-product  of  this 
procedure is that noisy components can be excluded from the analysis. Then, the
covariance matrix $\mathcal{X}_0$ has a diagonal form.

If  there  is  a  priori  information  that  the  expected  signal  is  located  in  a  certain  frequency 
band,  it  is  often  advisable  to  time-filter  the  data  prior  to  the  POP  analysis.  A  somewhat 
milder  form  of  focusing  on  selected  time  scales  is  to  derive  the  EOFs  from  time-filtered 
data  and  then  to  project  the  unfiltered  data  on  these  EOFs. 

Criteria to decide whether a POP contains useful information or should be regarded as
reflecting mostly sample properties are given by von Storch et al. (1988). The most
important rule of thumb is related to the cross spectrum of the POP coefficients $z^1$
and $z^2$: at the POP period $T$, or at least in the neighborhood of $T$, the two time
series should be significantly coherent and 90° out of phase, according to Eq. (6).

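Invariance against coordinate transformations. If the original time series
$\mathbf{x}(t)$ is transformed into another time series $\mathbf{y}(t)$ by means of
$\mathbf{y}(t) = \mathcal{L}\,\mathbf{x}(t)$ with an invertible matrix $\mathcal{L}$
(i.e., $\mathcal{L}^{-1}$ exists), then the eigenvalues are unchanged and the
eigenvectors transform as $\mathbf{x}$:

$$\mathcal{A}_X = \mathcal{X}_1\mathcal{X}_0^{-1}; \qquad
  \mathcal{A}_Y = \mathcal{Y}_1\mathcal{Y}_0^{-1}$$

with $\mathcal{Y}_1 = E[\mathbf{y}(t+1)\,\mathbf{y}^T(t)] = \mathcal{L}\mathcal{X}_1\mathcal{L}^T$
and $\mathcal{Y}_0 = \mathcal{L}\mathcal{X}_0\mathcal{L}^T$. Thus
$\mathcal{A}_Y = \mathcal{L}\mathcal{A}_X\mathcal{L}^{-1}$. If $\mathbf{p}_X$ is an
eigenvector of $\mathcal{A}_X$ with eigenvalue $\lambda$, i.e.,
$\mathcal{A}_X\mathbf{p}_X = \lambda\mathbf{p}_X$, then
$\mathcal{L}\mathcal{A}_X\mathbf{p}_X = \lambda\mathcal{L}\mathbf{p}_X$ and eventually
$(\mathcal{L}\mathcal{A}_X\mathcal{L}^{-1})(\mathcal{L}\mathbf{p}_X) = \lambda(\mathcal{L}\mathbf{p}_X)$.
That is, if $\mathbf{p}_X$ is a POP of the time series $\mathbf{x}$, then
$\mathcal{L}\mathbf{p}_X = \mathbf{p}_Y$ is a POP of $\mathbf{y}$ with the same
eigenvalue $\lambda$.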
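
This invariance is easy to confirm numerically. The sketch below uses a hypothetical three-component test series and an arbitrarily chosen invertible transformation. It estimates the system matrix before and after the transformation and checks that the transformed POPs are eigenvectors of the transformed system matrix with unchanged eigenvalues.

```python
import numpy as np

def system_matrix(series):
    """Estimate X1 X0^{-1} from a series with shape (n_time, n_dim)."""
    lag1 = series[1:].T @ series[:-1]
    lag0 = series.T @ series
    return np.linalg.solve(lag0, lag1.T).T

# Hypothetical test series with some memory from one time step to the next.
rng = np.random.default_rng(1)
x = np.zeros((400, 3))
for step in range(1, 400):
    x[step] = 0.8 * x[step - 1] + rng.standard_normal(3)

# An arbitrary invertible transformation (determinant 7).
transform = np.array([[2.0, 1.0, 0.0],
                      [0.0, 1.0, 1.0],
                      [1.0, 0.0, 3.0]])
y = x @ transform.T                      # y(t) = L x(t) for every t

a_x = system_matrix(x)
a_y = system_matrix(y)
eigenvalues_x, pops_x = np.linalg.eig(a_x)
transformed_pops = transform @ pops_x    # candidate POPs of y, one per column
```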

The EOFs are not invariant against linear transformations $\mathcal{L}$, since in
general the matrices $\mathcal{X}_0$ and $\mathcal{L}\mathcal{X}_0\mathcal{L}^T$ will
have different eigenvalues and eigenvectors. Therefore, if the POP analysis is begun
with a projection of the data on a truncated EOF expansion, the results of a POP
analysis will change if the data are transformed into another coordinate system.

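The POP coefficients. To get the POP coefficients, $z(t)$, two approaches are possible.
One is to derive the adjoint patterns $\mathbf{p}^A$ and to use Eq. (8). An alternative
is to not derive adjoint patterns but to derive the coefficients $z$ by a least-square
fit of the data $\mathbf{x}$ by minimizing

$$\left\| \mathbf{x} - z\,\mathbf{p} - [z\,\mathbf{p}]^* \right\|
  = \left\| \mathbf{x} - z^1\mathbf{p}^1 - z^2\mathbf{p}^2 \right\| \tag{12}$$

if $\mathbf{p}$ is complex, or

$$\left\| \mathbf{x} - z\,\mathbf{p} \right\| \tag{13}$$

if $\mathbf{p}$ is real.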

2.2  POPs  =  Trivial  Case  of  PIPs 

State space models. Many complex dynamical systems, $\mathbf{x} \in \mathbb{R}^n$, may
conveniently be approximated as being driven by a simpler dynamical system,
$\mathbf{z} \in \mathbb{R}^m$, with a reduced number of degrees of freedom, $m < n$.
Mathematically, this may be described by a state space model which consists of a system
equation

$$\mathbf{z}(t+1) = \mathcal{F}[\mathbf{z}(t), \boldsymbol{\alpha}, t] + \text{noise}, \tag{14}$$

for the dynamical variables $\mathbf{z} = (z_1, \ldots, z_m)$ and an observation equation

$$\mathbf{x}(t) = \mathcal{P}\,\mathbf{z}(t) + \text{noise}
  = \sum_j z_j(t)\,\mathbf{p}_j + \text{noise} \tag{15}$$

for the observed variables $\mathbf{x}$. $\mathcal{P}$ is the matrix whose columns are
the vectors, or patterns, $\mathbf{p}_j$. In general $\mathcal{P}$ is not a square
matrix. $\mathcal{F}[\mathbf{z}(t), \boldsymbol{\alpha}, t]$ denotes a class of models
which can be nonlinear in the dynamical variables $\mathbf{z}$ and which depends
additionally on a set of free parameters
$\boldsymbol{\alpha} = (\alpha_1, \alpha_2, \ldots)$. Both equations, Eqs. (14, 15), are
disturbed by an additive noise.

Since $m < n$, the time coefficient $z_j(t)$ of a pattern $\mathbf{p}_j$ at a time $t$ is
not uniquely determined by $\mathbf{x}(t)$. Instead, it may be obtained by a
least-square fit, i.e.,

$$\mathbf{z}(t) = (\mathcal{P}^T\mathcal{P})^{-1}\mathcal{P}^T\mathbf{x}(t) \tag{16}$$

The intriguing aspect of state space models is that the dynamical behavior of complex
systems often appears to be dominated by the interaction of only a few characteristic
patterns $\mathbf{p}_j$. That is, even if the dynamics of the full system are restricted
to the subspace spanned by the columns of $\mathcal{P}$, its principal dynamical
properties are represented.

PIPs. When fitting the state space model Eqs. (14, 15) to a time series, the following
entities have to be specified: the class of models $\mathcal{F}$, the patterns
$\mathcal{P}$, the free parameters $\boldsymbol{\alpha}$, and the dimension of the
reduced system $m$. The class of models $\mathcal{F}$ has to be selected a priori on the
basis of physical reasoning. Also, the number $m$ might be specified a priori. The
parameters $\boldsymbol{\alpha}$ and the patterns $\mathcal{P}$ are fitted
simultaneously to a time series by requiring them to minimize

$$\epsilon[\mathcal{P}; \boldsymbol{\alpha}] = E\left\| \mathbf{x}(t+1) - \mathbf{x}(t)
  - \mathcal{P}\left(\mathcal{F}[\mathbf{z}(t), \boldsymbol{\alpha}, t] - \mathbf{z}(t)\right) \right\|^2 \tag{17}$$

where $\epsilon[\mathcal{P}; \boldsymbol{\alpha}]$ is the mean square error of the
approximation of the (discretized) time derivative of the observations $\mathbf{x}$ by
the state space model. The patterns $\mathcal{P}$ that minimize Eq. (17) are called
principal interaction patterns (Hasselmann, 1988). If only a finite time series of
observations $\mathbf{x}$ is available, the expectation $E$ is replaced by a summation
over time.

In general, the minimization of Eq. (17) is not unique. In particular, the set of
patterns $\mathcal{P}' = \mathcal{P}\mathcal{L}$ with any nonsingular square matrix
$\mathcal{L}$ will minimize Eq. (17) if $\mathcal{P}$ does, as long as the corresponding
model $\mathcal{F}' = \mathcal{L}^{-1}\mathcal{F}$ belongs to the a priori specified
model class. This problem may be solved by requiring the solution to fulfill some
constraints, e.g., that the linear term in the Taylor expansion of $\mathcal{F}$ is a
diagonal matrix.

POPs as PIPs. The principal oscillation patterns can be understood as a kind of
simplified principal interaction patterns. For that, assume $m = n$. Then, the patterns
$\mathcal{P}$ span the full $\mathbf{x}$-space, and their choice does not affect
$\epsilon[\mathcal{P}; \boldsymbol{\alpha}]$. Also, let $\mathcal{F}$ be a linear model
$\mathcal{F}[\mathbf{z}(t), \boldsymbol{\alpha}] = \mathcal{A}\,\mathbf{z}(t)$, where the
parameters $\boldsymbol{\alpha}$ are the entries of $\mathcal{A}$. Then the dynamical
equation Eq. (14) is identical to Eq. (11). The constraint mentioned above leads to the
eigenvectors of $\mathcal{A}$ as being the PIPs of the particular, admittedly simplified,
state space model.

2.3 Associated Correlation Patterns

Definition and representation. The associated correlation pattern analysis (von Storch
et al., 1988) is a regression analysis to infer the spatial properties of a signal which
is encoded in a two-dimensional index (a complex POP coefficient, for instance). If the
parameter under consideration is $\mathbf{Y}(t)$ and the bivariate index is
$(z^1(t), z^2(t))$, the two associated correlation patterns $\mathbf{q}^1$ and
$\mathbf{q}^2$ minimize

The normalization in Eq. (18) has been introduced so that $\mathbf{q}^1$ represents a
typical state for $z^1(t) = 1$, $z^2(t) = 0$ and $\mathbf{q}^2$ a typical state for
$z^1(t) = 0$, $z^2(t) = 1$. The solution of Eq. (18) is straightforward and requires the
solution of a 2 × 2 linear equation at each location $r$ of the input field
$\mathbf{Y} = (y_r)$.

The associated correlation patterns can be displayed directly by the two patterns
$\mathbf{q}^1$ and $\mathbf{q}^2$ or by amplitude distributions and phase distributions
(Figure 6). The amplitude $A$ and the phase $\psi$ at the location $r$ are given by


A  =  (19) 


with $T$ being the period of the mode. The phase $\psi$ has been defined such that
$\psi = 0$ coincides with $z^2 = 0$ and $z^1 > 0$, and $\psi = T/4$ with $z^1 = 0$ and
$z^2 < 0$ (compare with Eq. (7)).
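
The regression behind the associated correlation patterns can be sketched as follows. This is a simplified sketch with a hypothetical function name: the index is used as given, without the normalization of Eq. (18), and the field is assumed to be gap-free with time along the first axis. The same 2 × 2 system is solved for every location, and the fraction of variance explained by the index at each location is returned as well, in the spirit of the skill measure of Eq. (21).

```python
import numpy as np

def associated_patterns(field, z1, z2):
    """field: (n_time, n_locations); z1, z2: index time series of length n_time.
    Returns q1, q2 and the fraction of variance explained at each location."""
    anomalies = field - field.mean(axis=0)
    index = np.column_stack([z1, z2])
    normal_matrix = index.T @ index                      # the 2 x 2 matrix
    q1, q2 = np.linalg.solve(normal_matrix, index.T @ anomalies)
    residual = anomalies - np.outer(z1, q1) - np.outer(z2, q2)
    explained = 1.0 - np.mean(residual**2, axis=0) / np.var(anomalies, axis=0)
    return q1, q2, explained

# Hypothetical data: three locations that follow the index, one that is pure noise.
time = np.arange(360)
z1 = np.cos(2 * np.pi * time / 60)
z2 = -np.sin(2 * np.pi * time / 60)
true_q1 = np.array([1.0, 0.5, 0.0])
true_q2 = np.array([0.0, 0.5, 1.0])
rng = np.random.default_rng(3)
signal = np.outer(z1, true_q1) + np.outer(z2, true_q2)
field = np.column_stack([signal, rng.standard_normal(360)])

q1, q2, explained = associated_patterns(field, z1, z2)
```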

Measure of skill. A number measuring the relative importance of a POP for a parameter
at the location $r$ is the rate of $y_r$ variance explained by the index $(z^1, z^2)$.
This rate is given by

$$\epsilon(y_r, z^1, z^2) = \frac{\mathrm{Var}(y_r) - \epsilon_r}{\mathrm{Var}(y_r)} \tag{21}$$

with $\epsilon_r$ being the local error in Eq. (18), summed over time $t$;
$\epsilon = 1$ indicates a perfect model and $\epsilon = 0$ a model without skill.

3 POP ANALYSIS OF THE INTRASEASONAL VARIABILITY
IN THE EQUATORIAL PACIFIC

General. The general analysis strategy is first to derive an index of the equatorial
modes through a principal oscillation pattern (POP) analysis of the equatorial current
meter moorings at 165°E, 140°W, and 110°W. The time series at these stations are
relatively long and sample the equator fairly well. Zonal currents and temperatures,
which ought to reflect equatorial Kelvin waves well, as well as meridional currents are
monitored by these buoys. After having established that the index makes sense, all
available data from the current meter moorings and from the ATLAS buoys are examined in
an “associated correlation pattern” analysis. The purpose of this exercise is to infer
the three-dimensional spatial structure of the modes.

3.1 Raw Data

For the analysis, daily observations were available from two series of moored buoys
(Hayes et al., 1991):

•  Current meter moorings at four locations, the exact positions of which are given in
Table 1. These buoys recorded zonal and meridional currents and temperature at
various levels and near-surface air temperature and zonal and meridional wind.

•  ATLAS buoys located at 20 positions in the near-equatorial Pacific (for the exact
positions, see Table 1). From these buoys, subsurface temperatures at various levels,
as well as near-surface air temperature and wind, are available.

The shortest time series is from 147°E, 5°N (9 months). Maximum length is 7 years (at 0°,
110°W and 140°W).

Mean State. The buoy data represent a good data base to sketch the mean distribution of
currents and temperature in the equatorial Pacific. In Figure 3 are plotted the mean
zonal current and temperature distributions along the equator as well as latitude-depth
cross-sections of temperature along 165°E and 110°W. The mean equatorial temperature
distribution is dominated by the sharp thermocline that separates water of 10-15°C at
deeper layers from warm surface waters of 24°C in the east and 28°C in the west. If we
identify the thermocline with the 20°C isotherm, then the thermocline rises from 180 m at
165°E to 100 m at 140°W to 60 m at 110°W. The zonal current is weakly westward at the
surface with maximum values below 25 cm/s. Maximum eastward flowing currents, the
Equatorial Undercurrent, prevail along the thermocline, with maximum values at about
the 17.5°C isotherm. At 165°E the maximum current is below 50 cm/s, at 140°W
maximum speeds are 100 cm/s, and at 110°W above 75 cm/s.

Maximum  temperatures  prevail  north  of  the  Equator  in  the  east  and  south  of  the  Equator 
in the west. The thermal wind relationship is nicely reflected in the mean distributions
(Fig. 3a, c and d).


Table 1. Position of buoys from which data have been used in the present study. Also
given is the maximum time interval for which at least one variable is available.

Instrument  Latitude  Longitude  Data interval  Parameters
CMM         0°        165°E      5/86 - 4/91    current, temperature, wind
CMM         0°        140°W      5/84 - 4/91    current, temperature, wind
CMM         0°        110°W      5/84 - 4/91    current, temperature, wind
CMM         7°N       110°W      5/88 - 4/91    current, temperature, wind
ATLAS       5°N       147°E      5/90 - 2/91    temperature, wind
ATLAS       8°N       165°E      5/90 - 4/91    temperature, wind
ATLAS       5°N       165°E      7/88 - 4/91    temperature, wind
ATLAS       2°N       165°E      7/87 - 4/91    temperature, wind
ATLAS       2°S       165°E      5/86 - 4/91    temperature, wind
ATLAS       5°S       165°E      7/87 - 4/91    temperature, wind
ATLAS       0°        169°W      5/88 - 4/91    temperature, wind
ATLAS       7°N       147°W      11/88 - 11/90  temperature, wind
ATLAS       9°N       140°W      5/88 - 4/91    temperature, wind
ATLAS       5°N       140°W      5/88 - 4/91    temperature, wind
ATLAS       2°N       140°W      5/87 - 4/91    temperature, wind
ATLAS       2°S       140°W      5/87 - 4/91    temperature, wind
ATLAS       5°S       140°W      10/90 - 4/91   temperature, wind
ATLAS       7°N       132°W      5/89 - 10/90   temperature, wind
ATLAS       0°        124°W      5/87 - 4/91    temperature, wind
ATLAS       5°N       110°W      5/86 - 4/91    temperature, wind
ATLAS       2°N       110°W      6/85 - 4/91    temperature, wind
ATLAS       2°S       110°W      5/85 - 4/91    temperature, wind
ATLAS       5°S       110°W      5/86 - 4/91    temperature, wind
ATLAS       8°S       110°W      5/86 - 6/87    temperature, wind

Variability around the annual cycle. The annual cycles have been removed from all
data. To also exclude part of the Southern Oscillation-related variability, this removal
of the annual cycle was done for each May-to-April segment separately. The May-to-April
segments were chosen to represent one “El Niño year” (Wright, 1985). As an example,
three variables at 0°, 140°W are shown before and after the removal of the low-frequency
variability (Figure 4).

At the equatorial buoy all parameters undergo marked variations on the interannual
time scale, some of which stem mostly from the regular annual cycle (e.g., the zonal
wind). In the subsurface variables the irregular ENSO-related variations contribute most
to the low-frequency variability. The high-frequency variations are normally
distributed. In the zonal wind the intraseasonal variations are almost white in time,
whereas the subsurface parameters exhibit an oscillatory behavior with typical periods
of 50-100 days. The zonal current seems to lead the temperature by a few days.

3.2  The  POP  Analysis  of  the  Equatorial  Current  Meter  Mooring  Data 

Preprocessing of the data. In the data field to be analysed, we have parameters that
differ with respect to units as well as with respect to their standard deviations. To
allow all parameters to play the same role in the analysis, all data are standardized to
zero mean and standard deviation one.

For  the  POP  analysis  it  is  often  helpful  if  the  data  are  preprocessed  prior  to  the  analysis 
with  the  purpose  of  suppressing  space-time  noise  (see  section  2.1).  The  spatial  noise  is 
taken  out  by  doing  the  analysis  in  a  low-dimensional  subspace  spanned  by  the  first  few 
EOFs,  and  the  temporal  variations  on  time  scales  irrelevant  for  the  process  under 
investigation  are  taken  out  by  a  time  filter. 

The data are first subjected to an EOF analysis. In this EOF analysis the entries $\gamma_{ij}$ of the
correlation matrix have been estimated from all available pairs of observations, i.e.,

$$\gamma_{ij} = \frac{1}{n_{ij}} \sum_{t \in T_{ij}} P_i(t)\, P_j(t) \qquad (22)$$

where $P_i(t)$ represents the $i$-th parameter of the data field $X(t)$ at time $t$, $T_{ij}$ is the set of all
times when both $P_i$ and $P_j$ have been observed, and $n_{ij}$ is the number of elements in $T_{ij}$.
Definition Eq. (22) is adequate for the case of gappy data. Only those pairs of indices $(i, j)$
were considered for which $n_{ij}$ was at least 50% of all possible observations.
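The estimate in Eq. (22) can be illustrated with a minimal sketch. It assumes that the series are already standardized, that missing values are encoded as NaN, and that "50% of all possible observations" means half of the series length. The function name `gappy_correlation` and the toy data are hypothetical and not taken from the paper.

```python
import math

def gappy_correlation(series_i, series_j, min_fraction=0.5):
    """Mean product of two standardized series over the times when both are observed.

    Missing values are encoded as NaN. Returns None when the number of common
    observations is below min_fraction of the series length (assumed reading
    of the 50% rule).
    """
    if len(series_i) != len(series_j):
        raise ValueError("series must have equal length")
    # Keep only the times at which both parameters were observed.
    common = [(a, b) for a, b in zip(series_i, series_j)
              if not (math.isnan(a) or math.isnan(b))]
    n_ij = len(common)
    if n_ij < min_fraction * len(series_i):
        return None
    return sum(a * b for a, b in common) / n_ij

nan = float("nan")
p1 = [1.0, -1.0, nan, 1.0, -1.0, 1.0]
p2 = [1.0, -1.0, 1.0, nan, -1.0, -1.0]
# Four common times contribute the products 1, 1, 1 and -1.
print(gappy_correlation(p1, p2))
```

Pairs that share too few observations return `None`, so a caller has to decide how to treat the corresponding matrix entries.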

The EOF coefficients $\alpha_k(t)$ are then no longer given as the dot product of the field $X(t)$ at
time $t$ and the respective EOF $e^k$ but are determined by a least-squares fit

$$\lVert X(t) - \alpha_k(t)\, e^k \rVert = \min . \qquad (23)$$
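For a single pattern, the least-squares condition has a closed-form solution when the norm is taken over the observed components only. The sketch below assumes exactly that reading (one EOF at a time, gaps encoded as NaN); the function name `eof_coefficient` and the numbers are illustrative.

```python
import math

def eof_coefficient(field, eof):
    """Amplitude alpha minimizing sum_i (x_i - alpha * e_i)^2 over observed components i."""
    num = 0.0
    den = 0.0
    for x, e in zip(field, eof):
        if math.isnan(x):
            continue  # skip components that were not observed at this time
        num += x * e
        den += e * e
    if den == 0.0:
        raise ValueError("no observed component overlaps the pattern")
    return num / den

nan = float("nan")
eof = [0.5, 0.5, 0.5, 0.5]        # unit-length pattern
field = [1.0, nan, 1.0, 1.0]      # twice the pattern, with one gap
alpha_fit = eof_coefficient(field, eof)
# A plain dot product with the gap set to zero underestimates the amplitude.
alpha_dot = sum(0.0 if math.isnan(x) else x * e for x, e in zip(field, eof))
print(alpha_fit, alpha_dot)
```

With complete data and a unit-length pattern the two estimates coincide, so the fit only matters where gaps occur.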
Figure 3. Mean distributions derived from the buoy data. The 20°C isotherm in the temperature
distributions (in 10*^ °C) and the zero line in the current distribution (in 10*^ m/s) are given as heavy
lines. Top: Longitude-depth cross sections of temperature and zonal current along the equator. Bottom:
Latitude-depth cross sections of temperature along 165°E and 110°W.
The EOF coefficients $\alpha_k(t)$ represent complete time series over the entire 7-year time
interval from May 1984 through April 1991. These time series are time-filtered such that
all variability on time scales below 10 days and above 180 days is completely eliminated and
all variability on time scales between 40 and 150 days is not affected. In the windows between
10 and 40 days and between 150 and 180 days the filter response function changes smoothly
between 0 and 1.
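The shape of such a response function can be sketched as follows. The text only states that the response changes smoothly in the transition windows, so the raised-cosine taper in period used here is an assumption, as is the function name `bandpass_response`.

```python
import math

def bandpass_response(period_days, stop_lo=10.0, pass_lo=40.0,
                      pass_hi=150.0, stop_hi=180.0):
    """Filter response as a function of period: 0 outside 10-180 days, 1 within 40-150 days."""
    def ramp(u):
        # Smooth rise from 0 to 1 for u in [0, 1] (assumed taper shape).
        return 0.5 * (1.0 - math.cos(math.pi * u))

    p = period_days
    if p <= stop_lo or p >= stop_hi:
        return 0.0
    if p < pass_lo:
        return ramp((p - stop_lo) / (pass_lo - stop_lo))
    if p <= pass_hi:
        return 1.0
    return ramp((stop_hi - p) / (stop_hi - pass_hi))

for period in (5.0, 25.0, 65.0, 120.0, 165.0, 200.0):
    print(period, round(bandpass_response(period), 3))
```

Both POP periods discussed in the text (65 and 120 days) lie in the pass band, so under this sketch the filter leaves them unattenuated.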

Results of POP analysis. Two oscillatory modes are identified whose coefficient time
series exhibit the desired high coherency and 90°-out-of-phase relationship. In Figure 5 the
amplitude time series of the two complex POP coefficients are plotted. Note that the
coefficient time series have been normalized so that Var(z(t)) = 1. The coefficients were
obtained by means of the adjoint patterns and Eq. (8).

One mode has a POP period T = 65 days and an e-folding time τ = 73 days. It represents
about 16% of the variance of the band-pass filtered, EOF-truncated and normalized data
(at all three locations, for temperature, zonal, and meridional currents as well as winds,
and at all depths). Consistent with the POP period, the maximum coherence is obtained
for 60 days. The amplitude time series reveals a marked annual cycle, with a definite
appearance of a semiannual component. The wave activity is strongest during solstice
conditions and weakest during equinoctial conditions.
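In the standard POP formulation, the period and the e-folding time follow from the complex eigenvalue of the estimated system matrix. The sketch below uses those standard relations; the paper's own definitions are not reproduced in this section, so the formulas, the one-day sampling interval and the function name `pop_times` are assumptions.

```python
import cmath
import math

def pop_times(eigenvalue, dt_days=1.0):
    """Return (period, e-folding time) in days for a complex POP eigenvalue.

    Assumes |eigenvalue| < 1 and a nonzero phase (an oscillatory, damped mode).
    """
    period = 2.0 * math.pi * dt_days / abs(cmath.phase(eigenvalue))
    e_folding = -dt_days / math.log(abs(eigenvalue))
    return period, e_folding

# Eigenvalue constructed to match the first mode quoted in the text.
lam = cmath.exp(complex(-1.0 / 73.0, 2.0 * math.pi / 65.0))
period, e_folding = pop_times(lam)
print(round(period, 1), round(e_folding, 1))
```

The absolute value of the phase is used because the sign convention of the eigenvalue differs between presentations of the method.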

The second mode has an e-folding time of 106 days and a POP period T = 120 days. But
the POP coefficients $z^1(t)$ and $z^2(t)$ have largest coherencies at 72 days, so that the POP
period of 120 days likely is an overestimate of the true oscillation period. This
mode represents 18% of the variance of the band-pass filtered, EOF-truncated, and
normalized data. The amplitude time series in Figure 5 is hardly affected by the annual
cycle. Instead the modification of the large-scale environment through the development of
warm El Niño conditions leaves a clear mark on the time series. During the warm event in
1986/87 and the early phase of the warm event in 1990/91 the activity of the waves is
enhanced.

The  two  modes  are  only  weakly  correlated.  The  correlations  between  the  real  and 
imaginary  parts  of  the  coefficient  time  series  are  very  small,  and  the  correlations  between 
the  real  (imaginary)  parts  of  the  two  modes  are  about  -0.25. 

3.3  The  Spatial  Signature  of  the  Mode 

General.  In  the  present  study,  associated  correlation  patterns  have  been  computed  from 
various  parameters  for  both  modes  separately.  In  all  cases  the  annual  cycle,  as  represented 
by  the  first  two  annual  harmonics  and  the  overall  mean  of  each  May-to-April  segment,  has 
been  removed  prior  to  the  analysis.  No  more  time-smoothing  was  done  because  of  the 
wide  gaps  in  the  data.  An  implicit  time-filtering  has  been  introduced  through  the  use  of  the 
POP coefficient time series. Since these time series have been derived from time-filtered
data  (see  above),  they  are  themselves  smooth.  Unlike  the  POP  analysis,  the  data  are  not 
normalized  for  the  associated  correlation  pattern  analysis. 

Currents  at  the  current  meter  moorings.  The  longitude-depth  distributions  of  the 
amplitudes  and  phases  of  the  two  intraseasonal  modes,  with  POP  periods  of  120  days  and 
65  days,  are  shown  in  Figure  6  for  the  zonal  current.  Both  modes  represent  eastward 
propagating  signals. 

The 120-day mode has its largest amplitudes in the central part of the tropical Pacific, with
maximum values of 16 cm/s, as typical anomalies, at 50 m depth at 165°E and 160 m
depth at 140°W. In contrast, the 65-day mode has maximum zonal current anomalies at
upper levels (50 m and above) in the eastern part of the basin, with a typical maximum of
12 cm/s at 140°W and 19 cm/s at 110°W.

In the 120-day mode, the zonal current signals need about 60 days to propagate from the
165°E buoy to the easternmost buoy at 110°W. If we accept the estimate of 120 days as a
period, then the mean phase speed is 1.8 m/s. This number is increased to 2.4 m/s or 3.0
m/s if the period is set to 90 or even 72 days (see above). The phase lines are vertically
tilted at 165°E and 140°W, with the upper levels lagging the lower levels by about 45° or
15 days (of a 120-day period).
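The quoted phase speeds can be checked with a short calculation. The sketch assumes a spherical Earth of radius 6371 km and treats the 60-day travel time as a fixed phase lag of half a period, so that a shorter period implies a shorter travel time. The function names are hypothetical.

```python
import math

EARTH_RADIUS_M = 6.371e6      # assumed mean Earth radius
SECONDS_PER_DAY = 86400.0

def equatorial_distance_m(delta_lon_deg):
    """Distance along the equator spanned by a longitude difference."""
    return math.radians(delta_lon_deg) * EARTH_RADIUS_M

def phase_speed(delta_lon_deg, phase_lag_fraction, period_days):
    """Phase speed in m/s for a signal whose lag is a given fraction of the period."""
    travel_time_s = phase_lag_fraction * period_days * SECONDS_PER_DAY
    return equatorial_distance_m(delta_lon_deg) / travel_time_s

delta_lon = (180.0 - 165.0) + (180.0 - 110.0)   # 165°E to 110°W: 85 degrees
lag_fraction = 60.0 / 120.0                     # 60 days of a 120-day base period
speeds = {p: phase_speed(delta_lon, lag_fraction, p) for p in (120.0, 90.0, 72.0)}
for period, speed in speeds.items():
    print(period, round(speed, 1))
```

Under these assumptions the three periods give speeds that round to the 1.8, 2.4 and 3.0 m/s quoted in the text.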

The phase speed for the 65-day mode is estimated to be, on average, 2.1 m/s. At the
two eastern positions, the phase lines are again tilted, with the lower levels leading the
upper levels by about 45° or 8 days (of a 65-day period). Maximum explained local
variance of the zonal current field is 40% at 120 m at 140°W for the 120-day mode and
20% at 120 m at 110°W for the 65-day mode.

Current  information  is  also  available  for  one  off-equatorial  location  from  the  7°N,  140°W 
buoy.  Here  a  maximum  of  7%  of  explained  variance  is  obtained  for  the  120-day  mode  at 
40  m,  where  an  amplitude  of  5.4  cm/s  is  found  (not  shown).  Thus  the  signal  is  weak  at 
that  location,  but  interestingly  the  sign  at  7°N  is  opposite  to  that  at  the  equator  (not 
shown).  A  similar  result  is  found  for  the  65-day  mode. 

In the meridional current the signal is negligible for the 120-day mode, but a well-defined
signal is identified in the 65-day mode. Maximum percentages of explained local variance
are 12% at 120 m and 160 m at 110°W. A maximum amplitude of 10 cm/s near the
surface lags an amplitude of about 8 cm/s at lower levels by about 10 days (not shown).
The phase relationship with the zonal current is that northward meridional current
anomalies lead easterly zonal current anomalies by 10 days or so. An alternative
interpretation is that easterly current anomalies lead southward current anomalies by 20
days or so.
Figure 6. Longitude-depth cross-section of the zonal currents of the 120-day and 65-day modes along the
equator. Top: The amplitudes A in 10'^ cm/s. Bottom: The phases ψ in days (relative to base periods
of 120 or 65 days).
Temperature at all buoys. For temperature, the amplitude distributions A and phase
distributions ψ are shown as three cross sections through the tropical Pacific: a longitude-depth
cross section along the equator (Fig. 7), a latitude-depth cross section at 110°W
(Fig. 8), and a longitude-latitude cross section at 100 m (Fig. 9).

Maximum temperature amplitudes of both modes cluster along the thermocline (Fig. 3a)
with maximum values of more than 1°C (Fig. 7). Overall, the temperature signal of the
120-day mode is stronger than that of the 65-day mode. The temperature signals
propagate, like the zonal current signals, eastward along the equator. The 120-day
temperature signal travels over the basin in about 90 days (relative to a base period of 120
days), so that the travel time of temperature is 1.5 times that of the zonal current, i.e., its
phase speed is only two-thirds as large. At the
165°E buoy, the temperature and zonal current signals are almost in phase so that the later
phase lags must stem from different travel times. The propagation of the temperature
signal of the 65-day mode is mostly parallel to that of the zonal current signal but there is
a uniform lag of about 10 days.

The latitude-depth cross sections of the associated correlation patterns at 110°W reveal
maximum amplitudes of more than 1°C at about 100 m depth. Both modes show a marked
amplitude minimum at 2°N and a maximum at 6°N. The activity of the 120-day mode is
largest south of the equator, with a maximum amplitude of 1.4°C at 2°S, whereas the
65-day mode has its largest amplitude of 1.4°C at 6°N. Both modes exhibit complicated phase
distributions. In the 120-day mode the phase varies mostly between 60 days at deeper
levels and 90 days at upper levels. Only along the minimum at 2°N does the phase markedly
lag its neighborhood, by 30 or more days. In the 65-day mode the maximum at 6°N is
180° out of phase with the temperature signal at the equator which, in turn, lags the
secondary maximum at 2°S by another 10 to 15 days.

Figure 9 shows the latitude-longitude distributions of the amplitudes and phases of the two
modes at 100 m depth. Maximum amplitudes of the order of 1°C at 140°W at the equator,
where the thermocline is close to 100 m, tend to appear simultaneously with even larger
(≈2°C) anomalies with opposite sign at 7°N. The eastward propagation is clearly visible in
the 65-day mode, whereas in the 120-day mode the eastward propagation seems to be
limited to the area west of 140°W. The isolated amplitude maximum at 5°N, 147°E should
not be taken too seriously because of the shortness of the time series at that location (see
Table 1).
Figure 7. Longitude-depth cross-section of temperature of the 65-day mode and of the 120-day mode along the
equator. Top: The amplitude distributions A in 10*^ °C. Bottom: The phase distributions ψ (in days
relative to the base periods of 120 and 65 days).
Figure 9. Horizontal distribution of temperature at a depth of 100 m of the 120-day mode and of the
65-day mode. Top: Amplitude distributions A in 10*^ °C. Bottom: Phase distributions ψ (in days relative to
the 120-day and 65-day base periods).

Discussion: Equatorial temperature anomalies and advection. Because of the marked
spatial gradients in the mean temperature field (Fig. 3) the temperature advection with the
anomalous zonal currents might contribute significantly to the creation of temperature
anomalies. Estimates of such temperature anomalies may be obtained for the equator since
information on the currents is available there. If the anomalies are labelled by a prime and the
mean state by an overbar, then the effect of the anomalous currents on the temperature is
approximated by

$$T' \approx -\left( u' \frac{\partial \overline{T}}{\partial x} + v' \frac{\partial \overline{T}}{\partial y} \right) \times \frac{\mathcal{T}}{2} \qquad (24)$$

with $T$, $u$, and $v$ representing the temperature and zonal and meridional currents, and $\mathcal{T}$ the
period; $x$ refers to the zonal direction and $y$ to the meridional direction. In the following
we consider the situation at 140°W at 120 m depth.

The zonal gradient of the mean temperature is approximately $2 \times 10^{-8}$ K/cm (Fig. 3). For the
120-day mode the anomalous zonal current is 10 cm/s (Fig. 6) and the period is somewhere
between 80 and 120 days. With these numbers Equation (24) yields a temperature anomaly
between 0.6 and 1.0°C, which compares well with the result of 1.0°C in Fig. 7. The
120-day mode is not connected with significant anomalies of the meridional current. Thus the
back-of-the-envelope calculation Eq. (24) suggests that the equatorial temperature
anomalies are due to anomalous zonal advection. This hypothesis is supported by the
different travel times of the temperature and zonal current signals, which were also found in a
numerical experiment on the response of the tropical Pacific to westerly wind bursts
(Latif et al., 1988).

The typical zonal current anomalies of the 65-day mode are only 5.4 cm/s at 120 m (Fig. 6),
and the characteristic time, half the period, is only 32 days. Thus the effect of zonal advection is
estimated as 0.3°C, which is significantly less than the 0.9°C shown in Fig. 7. Thus zonal
advection cannot fully explain the observed temperature anomalies, which is consistent
with the coincidence of the temperature and zonal current travel times. The 65-day mode
exhibits, however, a significant signal in the meridional current which could account for
equatorial temperature anomalies of 0.3°C.
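The advective estimates for both modes reduce to a product of three numbers. The sketch below reproduces that arithmetic for the zonal term only. The gradient exponent is garbled in the source and is taken here as 2 × 10⁻⁸ K/cm, an assumption chosen because it is consistent with the quoted results. The function name `advective_anomaly` is hypothetical.

```python
SECONDS_PER_DAY = 86400.0
MEAN_ZONAL_GRADIENT_K_PER_CM = 2.0e-8   # assumed value; exponent illegible in the source

def advective_anomaly(u_anom_cm_s, gradient_k_per_cm, characteristic_time_days):
    """Magnitude of the temperature anomaly from zonal advection over half a period."""
    return u_anom_cm_s * gradient_k_per_cm * characteristic_time_days * SECONDS_PER_DAY

# 120-day mode: 10 cm/s, period between 80 and 120 days (half period 40 to 60 days).
low_120 = advective_anomaly(10.0, MEAN_ZONAL_GRADIENT_K_PER_CM, 40.0)
high_120 = advective_anomaly(10.0, MEAN_ZONAL_GRADIENT_K_PER_CM, 60.0)
# 65-day mode: 5.4 cm/s, characteristic time 32 days.
est_65 = advective_anomaly(5.4, MEAN_ZONAL_GRADIENT_K_PER_CM, 32.0)
print(round(low_120, 2), round(high_120, 2), round(est_65, 2))
```

With the assumed gradient the 120-day estimates come out near 0.7 and 1.0 K and the 65-day estimate near 0.3 K, in line with the magnitudes discussed above.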

Kelvin waves? Are the modes identified and described so far what people call Kelvin
waves (Moore and Philander, 1977)? The vertical structure of the modes along the
equator, the horizontal scale, the eastward propagation and the time scale are broadly
consistent with the concept of equatorial Kelvin waves. But several aspects are
inconsistent with this concept. There are two modes, which have similar vertical
structures, similar horizontal scales and time scales, and which certainly cannot be accounted for
as the first two Kelvin modes. The presence of a signal in the meridional current in the
65-day mode does not fit the specification of a Kelvin wave, nor has the rich structure found
off the equator yet been described by the theory of equatorial Kelvin waves.

Johnson and McPhaden (1993) analyzed five years (1983-87) of current and temperature
data from the 140°W and 110°W equatorial moorings and seven months of data from
buoys at 2°S, 0° and at 2°N, 140°W. They used complex empirical orthogonal
functions (CEOFs, see also Section 4) and found one dominant mode that was broadly
consistent with the idea of a first baroclinic Kelvin wave. The main differences from a
conventionally defined Kelvin wave were these:

•  A local maximum and a local minimum of the zonal velocity below and above the core
of the equatorial undercurrent. This result holds for both modes identified in the POP
analysis.

•  An equatorial minimum of the temperature signal at the thermocline is straddled by
two maxima at 2°S and 2°N. In the present POP analysis, on the other hand, the
maximum at 2°S is reproduced, but north of the equator at 2°N a well-defined
minimum is identified. Possibly Johnson and McPhaden's (1993) result is due to the
short analysis period of only 210 days.

•  A nonzero temperature signal at the surface lags the zonal current signal at the surface
and the temperature signal at the thermocline by 90°. This result is confirmed by the
POP analysis, in particular for the 120-day mode.

The biggest difference from Johnson and McPhaden (1993) is the presence of two modes
which have uncorrelated coefficient time series but share substantial similarities in their
spatial appearance. A reason for this difference might lie in the different analysis
techniques. Johnson and McPhaden (1993) used CEOFs so that any two modes must be
orthogonal in space, whereas the POP analysis does not require orthogonality. If there are
two orthogonal modes $(T_i, u_i)$ $(i = 1, 2)$ with temperature signals $T$ and zonal current
signals $u$, the orthogonality requires

$$T_i^{\mathsf T} T_j + u_i^{\mathsf T} u_j = 0 . \qquad (25)$$

Because of the sharp thermocline in the east equatorial Pacific the largest temperature
anomalies will be centered around the thermocline so that $T_1 \approx T_2$. Thus to satisfy Eq.
(25) a negative correlation of the current signals is needed, i.e., $u_1 \approx -u_2$. This latter
condition represents a severe limitation without any physical justification. Therefore I
speculate that the CEOF technique could not easily be used to identify two orthogonal
modes in the equatorial (T, u) data. This (admittedly handwaving) argument might help to
resolve the apparent contradiction of only one mode in Johnson and McPhaden (1993) but
two modes in the POP analysis. On the other hand, there is no support in the literature (as
far as I know) for the idea of two non-orthogonal modes.
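The step from the orthogonality condition to the constraint on the currents can be written out explicitly. The lines below are a sketch of that reasoning under the idealization that the two temperature patterns nearly coincide; they show that the inner product of the current patterns must be negative, which the text summarizes as the two current signals being of opposite sign.

```latex
\begin{align*}
T_1 \approx T_2
  \;&\Rightarrow\; T_1^{\mathsf T} T_2 \approx \lVert T_1 \rVert^2 > 0, \\
T_1^{\mathsf T} T_2 + u_1^{\mathsf T} u_2 = 0
  \;&\Rightarrow\; u_1^{\mathsf T} u_2 \approx -\lVert T_1 \rVert^2 < 0 .
\end{align*}
```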

The  65  day  mode  is  not  envisaged  by  the  theory  of  equatorial  wave  dynamics.  This  theory 
deals  with  the  growth  of  small  disturbances  and  not  with  the  development  or  breakdown 
of  finite  amplitude  disturbances.  Schnur  et  al.  (1993)  have  shown,  for  the  case  of  synoptic- 
scale  disturbances  in  the  extratropical  troposphere,  that  the  POP  analysis  is  an  adequate 
tool  to  obtain  not  only  the  normal  modes  of  a  dynamical  system  but  also  modes  that 
represent  finite  amplitude  phases  in  the  full  spectrum  of  variability.  I  speculate  that  the  65 
day  mode  might  represent  such  a  finite  amplitude  mode.  It  remains  to  be  clarified  if  the 
results  of  this  study  will  stand  the  test  of  more  data,  longer  time  series,  and  closer  scrutiny. 
However,  one  has  also  to  keep  in  mind  that  the  present  theory  of  equatorial  Kelvin  waves 
is  based  on  a  number  of  severe  simplifications,  one  being  the  horizontal  homogeneity  of 
the  background  state. 

4.  CONCLUSIONS 

The  purpose  of  the  present  paper  is  two-fold.  The  main  point  is  to  introduce  the  POP 
technique  to  the  oceanographic  community.  The  minor  point  is  to  present  first  results  from 
an  analysis  of  data  that  are  irregularly  distributed  in  space  and  time. 
The POP technique. The POP method is a powerful method to infer simultaneously the
space-time characteristics of a vector time series. The basic idea is to isolate low-dimensional
subsystems that are controlled by the linear dynamics of the full system. Even
if the POP method represents the most consistent way of doing so, there are other
techniques that can be used successfully for similar purposes. An alternative is the complex
empirical orthogonal functions (CEOFs; Wallace and Dickinson 1972, Barnett and
Preisendorfer 1981). CEOFs are obtained by applying the conventional EOF technique to
a complex time series whose real part is the real time series that has to be analysed and
whose imaginary part is the Hilbert transform of that real time series. (CEOFs are related
to EOFs just like complex POPs to regular POPs.) The main difference between CEOFs
and POPs is that CEOFs are constructed under the constraint of a maximum of explained
variance and mutual orthogonality. The characteristic times, the period and the damping
time, are not an immediate result of the CEOF analysis but have to be derived a posteriori
from the CEOF coefficient time series. The POPs, on the other hand, are constructed to
satisfy a dynamical equation, Eq. (11), and the characteristic times are an output of the
analysis; also the complex POP coefficients z(t) are not pairwise orthogonal. The
non-orthogonality makes the mathematics less elegant, but it is not a physical drawback,
because in most cases there is no reason to assume that different geophysical processes
develop statistically independently of each other. The fraction of variance explained by the
POPs is not optimal and has to be calculated after the POP analysis from the POP
coefficients.
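The complexification step that precedes a CEOF analysis can be sketched in a few lines. The code builds the complex series from a real one with a naive discrete Fourier transform, which is adequate only for short series. The function names are hypothetical, and the sign convention (a cosine is paired with a sine of the same phase) is an assumption that may differ between CEOF implementations.

```python
import cmath
import math

def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (forward or inverse)."""
    n = len(x)
    sign = 1.0 if inverse else -1.0
    out = []
    for k in range(n):
        s = sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n) for t in range(n))
        out.append(s / n if inverse else s)
    return out

def analytic_signal(x):
    """Complex series whose real part is x and whose imaginary part is its Hilbert transform."""
    n = len(x)
    spectrum = dft(x)
    weights = [0.0] * n
    weights[0] = 1.0                 # keep the mean
    if n % 2 == 0:
        weights[n // 2] = 1.0        # keep the Nyquist component once
        upper = n // 2
    else:
        upper = (n + 1) // 2
    for k in range(1, upper):
        weights[k] = 2.0             # double positive frequencies, drop negative ones
    return dft([w * s for w, s in zip(weights, spectrum)], inverse=True)

n = 16
x = [math.cos(2.0 * math.pi * 3 * t / n) for t in range(n)]
z = analytic_signal(x)
max_err = max(abs(z[t].imag - math.sin(2.0 * math.pi * 3 * t / n)) for t in range(n))
print(max_err < 1e-9)
```

A CEOF analysis would then apply an ordinary EOF decomposition to the covariance matrix of such complex series, one per grid point or parameter.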

The POP method is not a tool that is useful in all applications. If the analysed vector time
series exhibits a strongly non-linear behaviour, as in turbulent flows, the POPs will fail to
identify  a  useful  sub-system,  simply  because  a  linear  sub-system  does  not  control  a 
significant  portion  of  the  variability.  The  POP  method  will  be  useful  if  there  are  a  priori 
indications  that  the  processes  under  consideration  are  to  a  first  approximation  linear. 

Equatorial Waves. We have found two modes of variability in the equatorial Pacific
Ocean. The slower mode, with a nominal period of 120 days, resembles a first baroclinic
Kelvin wave. The other, 65-day mode is different from theoretically derived modes and
from previously empirically derived modes. More work is needed to confirm the reality and
the signature of the two distinct modes.
ILLUSTRATING FREQUENTIST AND
BAYESIAN  STATISTICS  IN  OCEANOGRAPHY 


George Casella, Biometrics Unit, Cornell University, Ithaca, NY 14853
ABSTRACT 

Both  frequentist  and  Bayesian  methodologies  provide  means  for  a  statistical  solution  to  a 
problem. However, it is usually the case that, for a given situation, one methodology is
more  appropriate.  Using  a  number  of  oceanographic  examples  we  explore  the  components 
of  a  statistical  solution  and  illustrate  the  most  appropriate  methodology.  We  argue  that  the 
statistical  consideration  of  utmost  importance  is  the  type  of  inference  and  conclusion  to  be 
made.  In  some  examples  it  is  more  appropriate  to  make  this  inference  as  a  Bayesian,  and  in 
some  it  is  more  appropriate  to  make  this  inference  as  a  frequentist. 

“Still, it is an error to argue in front of your data. You find yourself insensibly twisting
them round to fit your theories.”

Sherlock  Holmes 
The  Adventure  of  Wisteria  Lodge 


1.  INTRODUCTION 

An  alternate  title  for  this  paper  might  well  be  “Conditional  and  Unconditional  Inference  in 
Oceanographic  Studies,”  as  a  fundamental  difference  between  frequentist  and  Bayesian 
statistics is their resulting inference. A frequentist inference is unconditional, applying to a
series of repeated experiments (almost always an imagined series). In contrast, a Bayesian
inference  is  conditional,  applying  to  the  data  at  hand,  and  not  directly  addressing  the 
concept  of  repeatability. 

This  paper  is  an  introduction  to  these  methods  and  illustrates  their  uses  with  some 
oceanographic  data  sets.  The  primary  message  is  that  each  statistical  view  has  a  lot  to 
offer, and, depending on the problem, one methodology is probably more appropriate. We
illustrate  this  through  the  examples. 

A second goal of this paper is to try to explain to the oceanographic community how a
statistician approaches a problem. The purpose of this endeavor is to provide a structured
approach to dealing with problems involving data, from their inception to their end. In doing
so, perhaps the task of dealing with the ever-increasing data bases can be made a little
easier.
The remainder of the paper is arranged as follows. In Section 2 we give a general outline
of  how  to  approach  a  problem  statistically,  illustrating  this  with  an  example  in  Section  3. 
Section  4  discusses  the  underlying  differences  between  the  frequentist  and  Bayesian 
approaches  to  statistics,  and  Sections  5  and  6  contain  more  examples  illustrating  these 
methodologies.  Section  7  is  a  concluding  discussion. 

2. COMPONENTS OF A STATISTICAL SOLUTION

In  the  best  of  all  possible  worlds,  a  problem  is  planned,  statistically,  from  beginning  to  end. 
Chronologically,  the  steps  of  a  solution  are  these. 

1. Model the Process
2. Design the Experiment
3. Collect the Data
4. Estimate and Verify the Model
5. Infer and Conclude
6. Implement the Solution

Although  the  steps  are  performed  in  chronological  order,  they  are  best  planned  in  reverse 
order.  That  is,  when  approaching  any  problem,  the  first  consideration  is  “How  will  the 
knowledge  we  gain  be  implemented?”  For  example,  if  a  study  is  proposed  to  examine 
wave  magnitude  and  direction  in  the  North  Atlantic,  the  first  consideration  should  be  the 
use  of  the  resulting  knowledge.  Will  it  be  used  to  plan  routes  for  oil  tankers?  Will  it  be 
used  to  increase  our  basic  knowledge  of  ocean  dynamics?  By  answering  these  questions 
first,  the  remainder  of  the  steps  of  a  statistical  solution  will  fall  into  place,  and  the  problem 
can  be  attacked  in  a  very  efficient  fashion.  Although  this  mechanism  for  solution  is  not 
usually  taught  in  the  classroom,  it  seems  to  be  the  one  most  preferred  by  statisticians.  By 
concentrating  on  the  final  result,  the  entire  study  becomes  focused. 

With  respect  to  frequentism  or  Bayesianism,  the  components  of  the  statistical  solution 
remain  essentially  unchanged.  Of  course,  there  are  some  differences  in  the  approaches, 
with  the  major  difference  being  in  the  modeling  and  inference  stages.  However,  the  overall 
attack  is  similar  and  is  illustrated  in  the  next  section. 

3.  AN  EXAMPLE  CONCERNING  ICEBERGS 

Defant  (1961,  page  278)  presented  the  following  data  on  the  frequency  of  icebergs  off 
Newfoundland. 
Table 1: Frequency of icebergs off Newfoundland south of 48°N
(a) and south of the Grand Banks (b) for the period 1900-1926.

Month  Jan  Feb  Mar  Apr  May  Jun  Jul  Aug  Sep  Oct  Nov  Dec  Total
(a)      3   10   36   83  130   68   25   13    9    4    3    2    386
(b)      0    1    4    9   18   13    3    2    1    0    0    0     51

For  our  example,  we  will  look  at  the  question  of  whether  the  yearly  distribution  of 
icebergs  is  the  same  in  each  location.  A  glance  at  Figure  1  will  show  that  such  a  hypothesis 
is  very  likely,  but  for  illustration  we  will  step  through  both  a  Bayesian  and  a  frequentist 
approach to the problem. We take the goal of our study to be the description of the
distribution  of  icebergs  off  Newfoundland. 

[Figure 1: Relative frequency of icebergs by month for locations (a) and (b).]

In both the Bayesian and frequentist approaches to this problem we assume that the data
are distributed according to a multinomial distribution, and we wish to test the null
hypothesis $H_0$: the distributions in locations (a) and (b) are the same. To test this as a
frequentist we use a chi-squared test of association (see Snedecor and Cochran, 1989).
The chi-squared test results in a p-value of 0.977, which is very strong evidence in favor
of the null hypothesis.
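The chi-squared test of association can be reproduced from the counts in Table 1. The sketch below uses only the Python standard library: the test statistic follows the usual observed-versus-expected formula, and the tail probability is obtained from a series expansion of the regularized incomplete gamma function. The function names are hypothetical, and this is one possible computation rather than the procedure actually used in the paper.

```python
import math

ROW_A = [3, 10, 36, 83, 130, 68, 25, 13, 9, 4, 3, 2]   # south of 48°N
ROW_B = [0, 1, 4, 9, 18, 13, 3, 2, 1, 0, 0, 0]         # south of the Grand Banks

def chi2_statistic(table):
    """Pearson chi-squared statistic and degrees of freedom for a two-way table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand = sum(row_totals)
    stat = 0.0
    for r, row in enumerate(table):
        for c, observed in enumerate(row):
            expected = row_totals[r] * col_totals[c] / grand
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(col_totals) - 1)
    return stat, dof

def chi2_sf(x, dof):
    """Upper tail probability of a chi-squared variable (series for the lower incomplete gamma)."""
    a = 0.5 * dof
    y = 0.5 * x
    if y <= 0.0:
        return 1.0
    term = math.exp(-y + a * math.log(y) - math.lgamma(a + 1.0))
    total = term
    n = 1
    while term > 1e-15 * total:
        term *= y / (a + n)
        total += term
        n += 1
    return 1.0 - total

stat, dof = chi2_statistic([ROW_A, ROW_B])
p_value = chi2_sf(stat, dof)
print(round(stat, 2), dof, round(p_value, 3))
```

The resulting p-value is close to the 0.977 quoted above. Several expected counts in row (b) are well below 5, so the chi-squared approximation is rough for this table.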

To  perform  a  Bayesian  analysis  a  prior  distribution  must  be  specified,  that  is,  a  distribution 
that  we  subjectively  believe  describes  the  pattern  of  icebergs.  We  then  use  this 
distribution,  in  conjunction  with  the  observed  data,  to  assess  the  plausibility  of  the 
hypothesis.  Since  we  really  have  no  prior  knowledge  about  the  icebergs,  we  use  a  strategy 
that attempts to model this ignorance and calculate the probability of every data table with
the given marginal totals, using a hypergeometric distribution. This leads us to use Fisher’s
exact test (Fisher, 1970) and assess the probability of the null hypothesis as 0.994. Again,
this  is  very  strong  evidence  in  favor  of  this  hypothesis.  (Strictly  speaking,  Fisher’s  exact 
test  is  not  a  Bayesian  procedure  but  a  conditional  procedure,  as  it  is  calculated 
conditionally  on  the  observed  data.  However,  the  important  feature  is  that  it  yields  a 
conditional  inference.) 

We now can clearly see the distinction between Bayesian and frequentist inferences. The
frequentist bases inference on a frequency interpretation. A formal conclusion would be of
the form, “the statistical procedure used (here the chi-squared test) would result in an
erroneous inference less than 5% of the time in repeated experiments.” In contrast, the
Bayesian inference is conditional on the observed data, and we would formally conclude
“based on the stated prior distribution and observed data, the probability is 0.994 that $H_0$
is true.” We now look at these differences more closely.

4. WHERE DOES THE RANDOMNESS COME FROM?

The  most  important  part  of  any  statistical  investigation  is  the  resulting  inference.  In  fact,  it 
may  even  be  said  that  the  main  reason  for  doing  a  statistical  investigation  is  to  produce  a 
meaningful  inference,  because  the  inference  applies  to  a  wider  population  than  is  actually 
studied  and  measured.  (For  example,  after  measuring  the  activities  of  a  number  of  waves 
in  a  certain  area,  we  are  then  interested  in  making  a  statement  (an  inference)  about  all 
waves  in  that  area.)  To  make  this  inference  we  need  an  underlying  model  of  the 
phenomena,  one  that  accounts  for  the  randomness  of  the  observations  and  allows  an 
inference.  Bayesians  and  frequentists  have  different  approaches  to  this. 

4.1  Frequency  Randomness 

The  frequentist  assumes  repeatability  of  the  experiment,  that  the  experiment  actually 
performed is one of an infinitely long sequence of identical experiments. If we denote this
sequence of experiments $E_1, E_2, \ldots$, then we make our inference to the entire
sequence, even though only one experiment (say $E_k$) is actually performed. The rest of this
imagined  sequence  builds  the  randomness  into  our  model.  We  know  that  the  results  of 
each  experiment  (if  performed)  would  be  slightly  different,  and  our  inference  will  take 
these  potential  differences  into  account. 

Thus,  the  frequentist  inference  is  an  unconditional  one  that  applies  to  the  entire  sequence 
and  does  not  single  out  the  experiment  actually  performed.  It  is  important  to  realize  that 
the  inference  is  about  the  performance  of  the  procedure  over  the  entire  sequence  of 
experiments, such as “The statistical procedure used will be correct in 95% of all
experiments performed.” The actual outcome of the observed experiment will not change
this  inference. 
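
The statement about performance over the whole sequence can be checked by simulation. The sketch below is illustrative only: it assumes Gaussian data with a known standard deviation and uses made-up values for the true mean, the sample size and the number of imagined experiments.

```python
import random
from statistics import NormalDist

def coverage(true_mean=1.0, sigma=2.0, n=25, experiments=20000, seed=0):
    """Fraction of repeated experiments whose 95% interval covers true_mean."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(0.975)        # two-sided 95% critical value
    half_width = z * sigma / n ** 0.5      # sigma is treated as known
    hits = 0
    for _ in range(experiments):
        xbar = sum(rng.gauss(true_mean, sigma) for _ in range(n)) / n
        hits += abs(xbar - true_mean) <= half_width
    return hits / experiments

rate = coverage()   # long-run proportion of correct intervals
```

The returned proportion describes the procedure, not any single interval: it is the same whichever experiment in the sequence happens to be the one actually performed.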

4.2  Bayesian  Randomness

In  a  Bayesian  analysis  the  data  are  assumed  to  be  fixed,  and  inference  is  made  conditional 
on  their  observed  values.  Thus,  no  randomness  comes  from  the  data.  The  randomness  in  a 
Bayesian  inference  comes  from  the  subjective  prior  distribution.  This  randomness, 
together  with  the  information  in  the  data,  is  combined  into  the  posterior  distribution.  The 
posterior  distribution  is  then  used  for  inference.  Of  course,  different  subjective  prior 
distributions  may  result  in  different  inferences. 

More precisely, suppose there are data, $X$, which vary according to a probability
distribution $f(x \mid \theta)$, a distribution indexed by an unknown parameter $\theta$. (For example,
$f(\cdot \mid \theta)$ may be a Gaussian distribution with unknown mean $\theta$.) We then assume that the
parameter $\theta$ varies according to a prior distribution $\pi(\theta)$. This probability distribution
reflects our knowledge about the parameter $\theta$ before observing the new data $x$. (In
keeping with convention, an upper case $X$ denotes an unseen random variable whereas a
lower case $x$ denotes an observed value. Thus the equation “$X = x$” means that we have
observed the value $x$ of the random variable $X$.) Using the laws of probability (sometimes
called Bayes rule) we calculate the posterior distribution of $\theta$ given $X = x$,
$g(\theta \mid x)$, as

$$g(\theta \mid x) = \frac{f(x \mid \theta)\,\pi(\theta)}{\int f(x \mid \theta)\,\pi(\theta)\,d\theta}$$

where the integral is over all values of $\theta$. (For more detail on such calculations, see Casella
and Berger, 1990.) Our inference is then based on $g(\theta \mid x)$, which only considers the
experiment actually performed, not any repeated sequence. For example, one might infer
“On the basis of the specified $\pi(\theta)$ and observed $x$, we conclude that $\theta > 0$ with probability
95%.” This inference would follow if it were the case that $\int_0^\infty g(\theta \mid x)\,d\theta = 0.95$.
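
For the Gaussian example mentioned above, the posterior is available in closed form. The following sketch assumes a single observation with known standard deviation and a Gaussian prior; the numerical values and the function name are hypothetical.

```python
from statistics import NormalDist

def gaussian_posterior(x, sigma, prior_mean, prior_sd):
    """Posterior of theta for one observation x ~ N(theta, sigma^2)
    under the prior theta ~ N(prior_mean, prior_sd^2)."""
    precision = 1 / sigma**2 + 1 / prior_sd**2
    post_mean = (x / sigma**2 + prior_mean / prior_sd**2) / precision
    return NormalDist(post_mean, precision ** -0.5)

post = gaussian_posterior(x=1.2, sigma=1.0, prior_mean=0.0, prior_sd=2.0)
prob_positive = 1 - post.cdf(0.0)   # integral of g(theta | x) over theta > 0
```

The quantity `prob_positive` is the kind of statement quoted in the text: a probability about the parameter, conditional on the one observed value and the stated prior.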

4.3  The  Appropriate  Inference 

As mentioned before, the purpose of this paper is not to make value judgments as to
whether  Bayesianism  or  frequentism  is  better.  Rather,  the  purpose  is  to  illustrate  situations 
where  one  method  is  more  appropriate.  It  then  follows  that  the  more  appropriate 
methodology,  and  inference,  is  the  one  to  use.  From  the  previous  two  subsections,  we  see 
that  the  frequentist  inference  is  more  appropriate  if  repeatability  is  important,  whereas  the 
Bayesian  inference  is  more  appropriate  if  the  inference  is  to  be  made  conditional  on  the 
observed data. Returning to the iceberg data, it seems that the Bayesian inference is more
appropriate, as we are faced with a data set that is unrepeatable, and we are interested in
an inference conditional on that data set. (Interestingly, it was argued during discussions at
the workshop that one could consider the observed 26-year period as one of a sequence of
26-year periods, in which case the frequentist inference may be more appropriate.) If it may
be  argued  that  either  interpretation  is  valid,  and  hence  either  inference  is  appropriate,  there 
is  no  problem.  As  long  as  the  methodology  is  chosen  to  appropriately  answer  the  question 
of  interest,  phrased  in  the  manner  of  interest,  the  statistics  have  served  their  purpose. 

5  AN  EXAMPLE  CONCERNING  BREAKING  WAVES 

Hwang et al. (1990) report on an experiment concerning average height of breaking
waves, $H_B$, measured as a function of RMS surface displacement, $\eta$. The data are
presented in Figure 2. They conclude that $H_B < H_s$, the significant wave height, where
$H_s = 4\eta$, and state, “In a random wave field, waves that break due to local instabilities are
not necessarily the highest waves.” Statistically, we can think of this as testing the
hypotheses

$$H_0\colon\ H_B \le 4\eta \quad \text{vs} \quad H_1\colon\ H_B > 4\eta.$$


It  seems  here  that  frequency  considerations  are  important,  in  that  conclusions  should  apply 
to  repetitions  of  the  experiment.  This  concern  seems  implicit  in  the  above  quoted 
conclusion  of  Hwang  et  al.  Thus,  a  frequentist  analysis  is  more  appropriate.  Using  a 
standard linear regression model with Gaussian errors, we obtain a p-value of 0.999 for
the hypothesis $H_B \le 4\eta$, showing that there is overwhelming evidence to support
this hypothesis. (In fact, the hypothesis $H_B \le 3\eta$ yields a p-value of 0.911,
demonstrating extremely good support for this even stronger claim.)

Figure 2. Breaking Waves. Averaged height of breaking waves, $H_B$, as a function of RMS
water surface displacement, $\eta$. The line shown is the least squares line, with equation
$H_B = 0.102 + 2.89\eta$ ($r^2 = 0.994$).

Of  course,  a  Bayesian  analysis  could  also  be  performed,  but  the  inference  would  not  apply 
to  a  sequence  of  experiments.  The  conclusions  would  be  conditional  on  the  observed  data. 
To  do  the  Bayesian  analysis  we  again  use  a  standard  linear  regression  model  with 
Gaussian errors, but we also assume that the coefficient of $\eta$ is a parameter $b$ with a specified
prior distribution. We specify the prior to also be Gaussian, and we take the prior mean to
be equal to the hypothesized value. (Thus, for testing $H_0\colon b \le 4$ we specify a
Gaussian prior with mean 4. This strategy of centering the prior at the hypothesized value
gives  equal  prior  weight  above  and  below  the  value,  and  may  be  considered  an  impartial 
prior  specification.) 

Combining  our  prior  specification  with  the  observed  data,  we  calculate 
$\Pr(b < 4 \mid \text{data}) = 0.999$ and $\Pr(b < 3 \mid \text{data}) = 0.623$. That is, for the specified priors and
conditional  on  the  observed  data,  b  is  less  than  4  with  probability  0.999  and  less  than  3 
with  probability  0.623.  Quantitatively,  these  conclusions  are  similar  to  those  of  the 
frequentist,  and  show  overwhelming  support  for  the  null  hypotheses.  The  only  difference 
is  in  the  scope  of  the  inference. 

Bayesian  conclusions  are,  of  course,  dependent  on  the  prior  specification,  and  sometimes 
there  might  be  concern  about  oversensitivity  to  this  specification.  Such  a  concern  is  easily 
addressed,  however,  by  calculating  posterior  probabilities  over  a  range  of  prior 
specifications.  This  is  illustrated  in  Figure  3,  where  we  display  the  posterior  probabilities 
over  a  wide  range  of  standard  deviations.  (The  standard  deviation  of  the  data  is  0.082,  and 
the  graph  shows  the  prior  standard  deviation  up  to  twice  this  value.)  The  figure  shows 
that,  for  this  range  of  prior  standard  deviations,  the  conclusions  from  the  Bayesian  analysis 
are relatively stable in their support of $H_0$.
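
A sensitivity check of this kind can be sketched with a conjugate Gaussian approximation. Several things here are assumptions rather than the calculation behind Figure 3: the slope estimate is treated as Gaussian around $b$, the value 0.082 is read as the standard error of that estimate, the intercept and the unknown error variance are ignored, and the function name is hypothetical.

```python
from statistics import NormalDist

def posterior_prob_below(h, estimate, se, prior_sd):
    """Pr(b < h | data) when estimate ~ N(b, se^2) and the prior is
    b ~ N(h, prior_sd^2), i.e. centered at the hypothesized value h."""
    w = prior_sd**2 / (prior_sd**2 + se**2)   # weight given to the data
    post_mean = w * estimate + (1 - w) * h
    post_sd = se * w ** 0.5
    return NormalDist(post_mean, post_sd).cdf(h)

estimate, se = 2.89, 0.082   # slope from Figure 2; se is an assumed reading of 0.082
for k in (0.25, 0.5, 1.0, 1.5, 2.0):          # prior sd as a multiple of se
    p4 = posterior_prob_below(4, estimate, se, k * se)
    p3 = posterior_prob_below(3, estimate, se, k * se)
    print(f"prior sd = {k * se:.3f}:  Pr(b<4) = {p4:.3f}  Pr(b<3) = {p3:.3f}")
```

A tight prior centered at the hypothesized value pulls the posterior probability toward one half, while a diffuse prior lets the data dominate. Scanning the prior standard deviation shows how far the conclusion depends on that choice.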


Figure 3. Posterior probabilities for the null hypotheses $H_0\colon b \le 4$ (solid lines)
and $H_0\colon b \le 3$ (dashed lines), as a function of the prior standard deviation.


6.  AN  EXAMPLE  CONCERNING  BUBBLE  POPULATIONS 

The  distribution  of  bubble  populations  is  also  investigated  by  Hwang  et  al.  (1990).  They 
collected  data  on  bubble  populations  as  a  function  of  depth  and  wind  velocity,  as 
presented  in  Figure  4.  For  a  given  depth  Z  (cm)  and  wind  velocity  u  (m/s),  the  logarithm 
of  the  bubble  population  N(Z)  (log  cm^)  is  modeled  as 

$$N(Z) = a_u + b_u Z + \epsilon, \qquad u = 10, 11, \ldots, 15$$

where $\epsilon$ represents random error and is assumed to have a Gaussian distribution with mean
0 and variance $\sigma^2$.

A  question  of  interest  is  whether  the  distribution  of  bubbles  is  the  same  at  each  depth. 

After  some  thought,  it  seems  that  the  appropriate  inference  here  is  the  frequentist 
inference.  Concern  about  the  repeatability  of  the  inference  leads  to  this  conclusion,  as  we 
would  like  to  be  able  to  describe  the  bubble  populations  at  a  given  depth  and  wind  velocity 
when  such  conditions  are  again  realized. 

6.1  A  Standard  Frequentist  Inference

A standard approach to this problem is to decide if the slopes are the same at each wind
velocity, so we would test the null hypothesis $H_0\colon b_{10} = b_{11} = \cdots = b_{15}$. Doing so leads to a
p-value of 0.063, which suggests rejection of $H_0$. Thus a standard frequentist analysis
would lead us to fit separate regression lines for each wind velocity. So for each wind
velocity we would use a separate regression equation to predict the bubble population.
See Table 2 and Figure 5.

Figure 4. Bubble Populations. The groups are in order from lowest to highest wind velocity
and are denoted solid squares (10 m/s), open squares (11 m/s), solid diamonds (12 m/s), open
diamonds (13 m/s), solid triangles (14 m/s) and open triangles (15 m/s). The data are connected
merely to aid viewing. Data from Hwang et al. (1990).

Figure 5. Standard frequentist (solid lines) and empirical Bayes (dashed lines) fits to the
bubble data, coded as in Figure 4. The empirical Bayes lines (whose slopes are pulled toward
-0.048) are under the least squares lines for 11, 14, and 15 m/s, and above the least squares
lines for 10 and 12 m/s. The lines are virtually identical for 13 m/s.

6.2  An  Empirical  Bayes  Analysis 

The bubble population data are ideal for an empirical Bayes analysis, a mixture of
frequency and Bayesian analyses that combines the best features of each. Here we will
only briefly explain the methodology; for a more detailed introduction see the articles by
Casella (1985, 1992).

To  perform  an  empirical  Bayes  analysis  we  start  with  the  frequentist  model  and  inference 
structure. We append a Bayes model to the slopes

$$b_u \sim \text{Gaussian}(b, \tau^2), \qquad u = 10, 11, \ldots, 15,$$

that is, that the slopes come from a common Gaussian population with unknown mean $b$
and variance $\tau^2$.

The “empirical” part of empirical Bayes is to now estimate these unknown parameters $b$
and $\tau^2$ from the data. (A standard Bayesian analysis would specify values for these
parameters.) Using these estimated values allows the data to assess the tenability of the
submodel, that the $b_u$'s come from a common population. The empirical Bayes slope
estimates are a convex combination of the common overall slope (-0.048) and the
individual least squares slopes, given by

empirical Bayes slope = (0.221) (-0.048) + (0.779) (least squares slope).

The  weighting  factors  0.221  (and  0.779  =  1  -  0.221)  are  data  based  estimates.  The 
empirical  Bayes  slope  estimates  are  valid  under  the  model  of  frequentist  repeatability.  In 
fact,  they  are  superior  to  the  frequentist  estimates  using  a  criterion  of  expected  mean 
squared  error.  Thus,  on  the  average,  the  empirical  Bayes  estimates  will  be  closer  to  the 
true  values  than  the  standard  frequentist  estimates.  They  combine  the  best  features  of 
Bayesian  modeling  and  frequentist  inference. 
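
The shrinkage formula can be applied directly to the least squares slopes listed in Table 2. The sketch below does only that arithmetic; it takes the weight 0.221 and the common slope -0.048 as given in the text and does not attempt to estimate them, and the names used are hypothetical.

```python
# Least squares slopes by wind velocity (m/s), as listed in Table 2.
ls_slopes = {10: -0.084, 11: -0.040, 12: -0.080,
             13: -0.050, 14: -0.031, 15: -0.0009}

COMMON_SLOPE = -0.048   # estimated overall slope quoted in the text
WEIGHT = 0.221          # estimated weight on the common slope

def shrink(ls_slope, common=COMMON_SLOPE, weight=WEIGHT):
    """Convex combination of the common slope and one least squares slope."""
    return weight * common + (1 - weight) * ls_slope

eb_slopes = {u: round(shrink(s), 3) for u, s in ls_slopes.items()}
```

Each empirical Bayes slope lies between the individual least squares slope and the common value, so slopes that start far from -0.048 move the most.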

Figure  5  also  shows  the  empirical  Bayes  regression  lines.  Although  they  are  not  very 
different  from  the  standard  frequentist  lines,  they  do  display  a  movement  toward  the 
common  slope  value.  The  empirical  Bayes  analysis  has  uncovered  a  small  amount  of 
common  structure  and  has  used  this  in  improving  each  of  the  estimates. 


Table 2. Coefficients for the standard regression analysis
(frequentist) and empirical Bayes analyses of the bubble
populations.

Wind                                             empirical
velocity   n   intercept   slope     std. dev.   Bayes slope
10         4   0.666       -0.084                -0.076
11         5   0.924       -0.040                -0.042
12         4   1.594       -0.080    0.008       -0.073
13         5   1.669       -0.050    0.017       -0.050
14         4   1.698       -0.031    0.029       -0.035
15         4   1.635       -0.0009   0.027       -0.011

7  CONCLUSIONS

The  statistical  methodology  to  be  used,  whether  Bayesian  or  frequentist,  should  be 
selected  according  to  the  type  of  inference  that  is  desired  (and  is  appropriate).  The 
frequentist methodology is appropriate for inference over a series of repeated experiments,
while the Bayesian methodology is appropriate for inference specific to the experiment
that  was  done.  This  article  has  given  examples  and  provided  discussion  of  situations  where 
each  methodology  is  appropriate. 

There  is  no  brick  wall  between  Bayesianism  and  frequentism.  The  methodologies  are  not 
at  odds  with  one  another;  they  are  complementary  to  one  another.  When  approaching  a 
statistical  problem  “opportunism”  is  best.  With  that  in  mind,  the  appropriate  analysis  and 
inference  can  be  chosen  from  all  available  statistical  methodologies. 

Both  Bayesianism  and  frequentism  are  built  on  a  set  of  assumptions,  some  more  palatable 
than  others.  For  a  user  of  frequentist  methods,  perhaps  the  assumption  most  difficult  to 
believe  is  that  the  process  (including  parameter  values)  remains  constant  over  the  imagined 
series of experiments. For a user of Bayesian methods, perhaps the assumption most
difficult to believe is that the prior distribution is correct. These assumptions, however,
can sometimes be checked and maybe even relaxed. Moreover, their reasonableness in
any particular situation may also form a basis for choosing an appropriate methodology.
(See Berger's (1985) discussion of robust Bayesian analysis, which addresses these
concerns.) Lastly, there is an enormous amount of research being done in statistics, and
some  of  it  is  aimed  at  relaxing  these  assumptions.  Such  research  has  already  given  us 
techniques  like  empirical  Bayes  analysis,  a  synthesis  of  both  Bayesian  and  frequentist 
methodologies  which  can  often  provide  superior  solutions. 

This paper is technical report BU-1187-M, in the Biometrics Unit, Cornell University.

This research was supported by National Science Foundation Grant No. DMS9100839
and National Security Agency Grant No. 90F-073.

BAYESIAN  METHODS:  AN  INTRODUCTION
FOR  PHYSICAL  OCEANOGRAPHERS

“You could not step twice into the same river, for new waters are
ever flowing on to you,” Heraclitus, as quoted in Bartlett (1980).


ABSTRACT 

The Bayesian approach to statistics is a conceptually simple method of treating
uncertainty. It involves modeling uncertainty with probability, and conditioning on such
data as become available. Because of its flexibility, there are many styles of application.
Using the same examples as George Casella's paper in this volume, I discuss how the
Bayesian method approaches such problems.

1  A  GENERAL  INTRODUCTION  TO  BAYESIAN  IDEAS 

Most statistical analyses begin with some data, denoted $x$, and a parameter, denoted $\theta$.
These may be discrete or continuous, and may have vector, matrix, or more complex
structures. For the purposes of this section, the nature of $x$ and $\theta$ does not matter, but in
application they are very important.

The mechanism generating the data is called the likelihood function, and is written
$f(x \mid \theta)$. Here $f$ may be a probability mass function, if $x$ is discrete, or a probability density,
if $x$ is continuous. In both cases it describes the probabilistic behavior of the data, $x$, for a
fixed value of the parameter $\theta$. The second part of a statistical model is a prior distribution
$\pi(\theta)$, which again might be a probability mass function if $\theta$ is discrete, or a probability
density if $\theta$ is continuous.

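These two ingredients determine the joint distribution of $x$ and $\theta$ as follows:

$$h(x, \theta) = f(x \mid \theta)\,\pi(\theta) \qquad (1)$$

Once the data $x$ are observed, the laws of probability prescribe how the conditional
distribution of $\theta$ given $x$ is to be calculated:

$$g(\theta \mid x) = \frac{h(x, \theta)}{p(x)} = \frac{h(x, \theta)}{\int_\Omega h(x, \theta)\,d\theta} = \frac{f(x \mid \theta)\,\pi(\theta)}{\int_\Omega f(x \mid \theta)\,\pi(\theta)\,d\theta} \qquad (2)$$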


The distribution is called the posterior distribution of $\theta$, because it is the distribution of $\theta$
after having observed $x$. Thus the import of the data is to change the distribution of $\theta$ from
the prior, $\pi(\theta)$, to the posterior, $g(\theta \mid x)$. Everything in this paper is a discussion or an
application  of  these  ideas. 

The  essential  idea  here  is  the  use  of  probability  to  express  uncertainty.  Having  decided  to 
do  that,  formula  (2)  follows  from  formula  (1)  by  very  simple  and  non-controversial  steps. 
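
When the parameter takes only finitely many values, the step from formula (1) to formula (2) is a multiplication followed by a normalization. The sketch below shows this for a binomial likelihood; the three candidate parameter values, the uniform prior and the observed count are hypothetical.

```python
from math import comb

def posterior(prior, likelihood, x):
    """Apply formulas (1) and (2) on a finite parameter set.

    prior: dict mapping theta to pi(theta);
    likelihood(x, theta): returns f(x | theta)."""
    joint = {t: likelihood(x, t) * p for t, p in prior.items()}   # h(x, theta)
    p_x = sum(joint.values())                                     # marginal p(x)
    return {t: h / p_x for t, h in joint.items()}                 # g(theta | x)

def binomial(x, theta, n=10):
    """f(x | theta): probability of x successes in n trials."""
    return comb(n, x) * theta**x * (1 - theta)**(n - x)

prior = {0.2: 1 / 3, 0.5: 1 / 3, 0.8: 1 / 3}   # hypothetical uniform prior
post = posterior(prior, binomial, x=7)
```

The sum over the parameter values plays the role of the integral in (2); with a continuous parameter the same two steps apply with the sum replaced by an integral.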

One important matter is the interpretation given to probability here: whose probabilities
are these? Although there are some Bayesians who would give other answers, the
dominant answer now (and the one to which I subscribe) is that these probabilities are
subjective,  and  reflect  the  opinion  of  the  writer,  or  opinions  the  writer  believes  others 
hold.  Bayesians  do  not  come  to  this  view  gladly.  We  wish  there  were  a  way  to  guarantee 
that  the  equations  written  capture  the  objective  truth,  but  such  guarantees  do  not  seem 
possible.  We  observe  in  science  disagreements  in  which  none  of  the  sides  has  made  a 
provable,  mathematical  error.  The  progress  of  a  science  might  then  be  thought  of  as  the 
development  of  informed  opinion  on  a  subject. 

The  name  “Bayesian”  incidentally,  is  in  honor  of  Rev.  Thomas  Bayes,  an  eighteenth 
century  English  minister  and  “natural  philosopher.”  He  found  the  principle  now  embedded 
in  (2),  and  hence  this  way  of  thinking  about  and  doing  statistics  is  named  for  him. 

Finally, note that the quantities $x$ and $\theta$ are simply random variables with some joint
distribution, although one is written with a Roman letter and one with a Greek letter. If
one began with the joint distribution $h(x, \theta)$ and learned $\theta$, the posterior on $x$ given $\theta$
would be $f(x \mid \theta)$, and would represent what was known about $x$ after $\theta$ had been learned.
Thus the model is symmetric in $x$ and $\theta$, although to encourage intuition it is customary to
think  of  the  former  as  data  and  the  latter  as  a  parameter. 

In  the  remainder  of  this  paper,  I  discuss  Casella's  iceberg  example  in  section  2,  breaking 
waves in section 3 and bubble data in section 4. In section 5, I give my views on
frequentism and the possibility of compromises between Bayesian and frequentist ideas.
Finally in section 6 I give my conclusions.

2.  THE  ICEBERG  DATA 

Before  discussing  the  elements  of  a  model,  I  think  it  is  most  useful  to  get  the  question 
straight, which corresponds most closely to Casella's steps 5 and 6.

Everyone with a modicum of liberal arts training knows about “compare and contrast”
questions.  The  point  is  that  there  are  always  similarities  and  always  differences.  Either  can 
be  celebrated. 

Looking  at  the  graph  of  relative  frequencies  of  icebergs,  it  is  clear  that  most  of  the  story 
here  is  in  the  similarity  of  the  patterns.  But  one  could  also  look  for  differences,  for  the 
“contrast.”  If  you  ask  me  to  believe  that  the  frequencies,  month-by-month,  of  icebergs  are 
exactly  the  same  to  an  arbitrary  number  of  decimal  places,  I  must  reply  that  I  cannot.  Thus 
I  regard  Casella's  null  hypothesis  as  foolishness.  I  put  zero  prior,  and  hence  zero  posterior, 
on  its  truth.  So  I  need  a  better  question. 

Suppose instead I ask what I consider to be a better question: how far apart are
$\theta^N = (\theta_1^N, \ldots, \theta_{12}^N)$, frequencies of icebergs south of 48°N, and $\theta^S = (\theta_1^S, \ldots, \theta_{12}^S)$,
frequencies of icebergs south of Grand Banks? I could measure this in a variety of ways,
such as

$$d_1 = \ldots \qquad (3)$$

and

$$d_2 = \ldots$$

Now a prior on $(\theta^N, \theta^S)$ and a likelihood on counts given $(\theta^N, \theta^S)$ will yield a posterior,

and  I  can  compare  what  I  thought  about  a  distance  measure  before  I  saw  the  data  with 
what  I  think  after  I  see  the  data. 

This is a measure of what I have learned from the data about how different $\theta^N$ and $\theta^S$ are.
So this is how I think a modern Bayesian would structure the problem.

What  are  the  data?  If  they  are  a  complete  census  of  all  icebergs  from  1900  to  1926,  then 
we know that the hypothesis $H_0$ is false. So suppose that these are a random sample of a
larger  population  of  icebergs.  How  do  these  particular  icebergs  come  to  be  in  the  data  set? 
Because  someone  observed  them,  presumably.  Is  it  reasonable  to  assume  that  icebergs 
have  the  same  chance  of  being  observed,  regardless  of  month?  I  should  think  that  the 
summer  months  are  easier  to  observe  than  the  winter  months,  because  more  observers  will 
be  around  and  weather  conditions  are  better.  The  critical  issue  is  whether  the  observation 
bias, I should believe, is the same for the two areas. Thus if $\theta_i^W$ is the probability of a
random iceberg in region $W$ being there in month $i$, and $\eta_i^W$ is the probability of its being
observed, then $\eta_i^W \theta_i^W$ is the probability of an iceberg being there and being observed, and
the frequencies observed have probabilities proportional to $\eta_i^W \theta_i^W$.

Note that if I believe that iceberg generation is constant by month ($\theta_1 = \cdots = \theta_{12} = 1/12$),
then the data give information about the observation intensities. Which interpretation to
give to the data depends on what you believe. The Bayesian method can't say which is
right  or  wrong,  but  it  does  provide  for  (and  insist  on  having)  a  full,  probabilistic  statement 
of  what  the  investigator  believes.  Reasonable  people  need  not  agree  on  these  matters. 
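
The point that the data alone cannot separate presence from observation can be made concrete. The sketch assumes that the month probabilities for an observed iceberg are the products of presence and observation probabilities, rescaled to sum to one; the seasonal pattern and all numerical values are invented for the illustration.

```python
def observed_month_probs(theta, eta):
    """Month probabilities for an observed iceberg: eta_i * theta_i, normalized."""
    products = [e * t for e, t in zip(eta, theta)]
    total = sum(products)
    return [p / total for p in products]

raw = [5 if i in (3, 4, 5) else 1 for i in range(12)]   # hypothetical spring peak
peaked = [r / sum(raw) for r in raw]
flat = [1 / 12] * 12

# Story A: seasonal icebergs, constant observation probability.
story_a = observed_month_probs(theta=peaked, eta=[0.5] * 12)
# Story B: icebergs constant by month, seasonal observation probability.
story_b = observed_month_probs(theta=flat, eta=[4 * p for p in peaked])
```

The two stories give the same probabilities for the observed months, so the counts cannot choose between them. Only the investigator's prior beliefs about presence and observation can.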

This  allows  readers  to  judge  those  beliefs,  and  possibly  approximate  the  calculations  the 
reader might do with his own beliefs. The argument affects the likelihood as well as the
prior; both are subjective. Note that I now have more parameters than I have data points.
Hence  a  frequentist  treatment  of  such  a  model  is  impossible.  Frequentist  analysis  thus 
encourages  you  not  to  delve  too  deeply,  not  to  ask  such  questions. 

Even the above formulation is too simplistic, since it assumes that the probabilities $\theta$ and $\eta$
are constant over years. Since during the period of the data collection both the sinking of
the Titanic and World War I occurred, it is hard to believe that $\eta$, the observation
probabilities,  were  constant.  A  careful  modeling  of  the  data  would  have  to  take  this  into 
account  and  would  treat  skeptically  claims  of  a  vast  increase  in  icebergs  in  the  latter  half  of 
the  period. 

Priors on $\theta$ are important for the inference in question. The first tool a statistician would
think of in this regard is a Dirichlet distribution (a multivariate Beta distribution) on the
vector $(\theta_1, \ldots, \theta_{12})$. However, the Dirichlet has some unattractive features for this purpose,
principally that it treats all the months symmetrically, without making use of the adjacency
of them. I would prefer to think of a continuous model in which the critical parameter is
an angle, which could be given a Fisher/von Mises distribution. This is like a normal (or
Gaussian) distribution for angles and has as hyperparameters a central tendency, $\nu$, and a
measure of spread, $\tau$. Thus $\nu$ would indicate the direction of greatest iceberg intensity,
thinking of time through the year as circular. Looking at the data, perhaps a good estimate
would be $\nu$ = May 10. The measure of spread, $\tau$, would indicate how peaked the
distribution is. To complete the model, a prior on both $\nu$'s (North and South), and
both $\tau$'s would be necessary.
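
A circular model of this kind can be sketched by placing a von Mises shape on the days of the year and normalizing by direct summation. The sketch uses the usual concentration parameter, here called `kappa` (larger values give a more peaked distribution); how it relates to the spread measure in the text is not specified, and the value 2.0 is hypothetical.

```python
from math import cos, exp, pi

def von_mises_day_probs(peak_day, kappa, days=365):
    """Probability for each day of a circular year, proportional to
    exp(kappa * cos(angle - peak_angle)), normalized by direct summation."""
    peak = 2 * pi * peak_day / days
    weights = [exp(kappa * cos(2 * pi * d / days - peak)) for d in range(days)]
    total = sum(weights)
    return [w / total for w in weights]

MAY_10 = 129   # day index of May 10 in a non-leap year, counting January 1 as 0
probs = von_mises_day_probs(MAY_10, kappa=2.0)   # kappa is a hypothetical value
mode = max(range(365), key=probs.__getitem__)
```

Unlike a Dirichlet prior on twelve unordered cells, this form makes neighboring days similar and treats late December and early January as adjacent.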


In these terms, I think that the quantity $d_3 = |\nu^N - \nu^S|$ would be useful as an alternative to
$d_1$ and $d_2$. The main advantage of $d_3$ is that its units are days, which is natural and might
have a physical interpretation.

It  is  now  time  to  turn  attention  to  inference,  in  Casella's  sense.  The  frequentist  statement, 
applied  to  this  situation,  is  that  if  the  null  hypothesis  of  no  difference  were  true,  and  the 
experiment were repeated an infinite number of times with the same parameter values, the
data would be as or more extreme in only 1 - 0.977 = 0.023 proportion of the cases. Thus
the conclusion is that either the null hypothesis is false or something unusual has
happened. But frequentists cannot say which, or even give a probability on which. Note
that 0.023 is NOT the probability that the null hypothesis is false. Not believing $H_0$, I
don't find this frequentist probability calculation useful.

Casella's  version  of  a  Bayesian  treatment  of  this  problem  is  not  recognizably  Bayesian  to 
me.  All  it  does  is  condition  on  both  margins  in  the  table  (total  icebergs  observed  by  month, 
and  total  icebergs  observed  N  and  S),  and  then  calculates  a  frequentist  p-value.  The  only 
warranted  statement  from  his  calculation  is  again  that  either  something  unusual  happened 
(with  probability  less  than  0.006),  or  the  null  hypothesis  is  false.  But  again  he  cannot  say 
which,  nor  give  a  probability  for  it.  I  see  no  justification  for  Casella's  statement  that  “the 
probability  of  the  null  hypothesis  is  0.994.” 

One  interesting  way  to  think  about  these  statistical  procedures  is  to  ask  what  happens  as 
the sample size grows large. In frequentist statistics, no sharp null hypothesis (such as
$\theta^N = \theta^S$) is significant if the sample size is small. However, as the sample size grows large,
every  such  hypothesis  will  turn  out  to  be  significant.  Thus  significance  measures  sample 
size  more  powerfully  than  it  does  the  extent  to  which  the  “straw-man”  null  hypothesis  is 
false.  Since  better  measures  of  sample  size  are  generally  available,  significance  testing  is,  in 
my  judgment,  not  very  useful. 
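
The dependence of significance on sample size can be seen in a simple calculation. The sketch assumes a z-test of a sharp null hypothesis about a mean, with known standard deviation, and supposes that the sample mean sits exactly at a small fixed departure from the null; the numbers are hypothetical.

```python
from math import erfc, sqrt

def two_sided_p(delta, sigma, n):
    """p-value of a z-test of 'mean = 0' when the sample mean equals delta."""
    z = abs(delta) * sqrt(n) / sigma
    return erfc(z / sqrt(2))   # equals 2 * (1 - Phi(z))

# A fixed, tiny departure from the sharp null, at increasing sample sizes.
for n in (100, 10_000, 1_000_000):
    print(n, two_sided_p(delta=0.01, sigma=1.0, n=n))
```

The departure from the null is the same in every row. Only the sample size changes, yet the p-value moves from unremarkable to overwhelming.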

By  contrast,  in  the  Bayesian  analyses  I  have  been  discussing,  as  the  sample  size  grows,  the 
posterior  distribution  of  whichever  d  you  like  will  converge  to  a  point.  You  will  then 
effectively  know  how  far  from  true  the  hypothesis  of  equality  is,  by  your  chosen  measure. 
What  to  make  of  it  then  depends  on  what  you  are  doing  scientifically,  whether  you  want  to 
emphasize  the  “compare”  or  the  “contrast”  side. 

I  have  written  at  some  length  about  the  iceberg  data  because  it  gives  me  an  opportunity  to 
illustrate  how  Bayesian  thinking  helps  me  to  model  a  process.  The  important  points,  in  my 
view,  are 

• The frequentist hypothetical infinite sequence of identical circumstances is a figment
of their imaginations.

• Priors and likelihoods are important because they correspond to something real:
what you believe about the data.

• Frequentist ideas can get in the way of good modeling because you can easily get
too many parameters.

• Testing sharp null hypotheses is generally a foolish undertaking, because they are
each, to a greater or lesser degree, wrong.

3  BREAKING  WAVES

The  principal  difference  between  this  example  and  the  previous  one  is  that  the  null 
hypothesis is no longer sharp. That is, inferential attention focuses on a single parameter
$b$, and whether $b \le 4$ or $b \le 3$.

Unlike  Casella,  I  would  not  center  the  prior  at  the  hypothesized  value,  but  would  instead 
have  it  represent  my  honest  opinion,  or  my  view  of  what  some  other  scientific  opinion 
might  honestly  be.  My  summary  would  be  the  posterior  distribution  on  the  parameter  b, 
from which one could calculate $P(b \le 4 \mid \text{data})$, $P(b \le 3 \mid \text{data})$, and any other probabilities
that  might  be  of  interest. 

4.  BUBBLE  DATA 

This  is  similar  to  the  breaking  wave  data,  except  that  there  are  several  regressions  instead 
of  a  single  one.  Such  a  model  is  called  hierarchical.  These  have  proven  useful  in  a  very 
wide  variety  of  domains. 

At the first level, the log bubble population is modeled as

$$\log N_n(Z) = a_n + b_n Z + \epsilon_n$$

where $\epsilon_n$ is Gaussian with mean 0 and variance $\sigma^2$, $N_n(Z)$ and $Z$ are observed, and $a_n$, $b_n$
and $\sigma^2$ are parameters. At the second level, there might be a bivariate Gaussian
distribution on $(a_n, b_n)$ with some mean $(a, b)$ and some covariance matrix $\Sigma$. Finally, a
third level would specify a prior on $(a, b, \Sigma, \sigma^2)$. Such a model is complete if each quantity
mentioned has a distribution. A complete model permits a Bayesian analysis, conditioning
on the observed data, as a Bayesian should. Interest may focus on the parameters at any
level: $(a_n, b_n)$ might be of interest, or $(a, b)$, or any of the others.
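
To make the three levels concrete, the following Python sketch simulates data from a hierarchy of this kind. It assumes that the first level is a straight-line regression of log population on Z, and it replaces the third-level prior by fixed, hypothetical values; the function name, the numbers, and the grid of Z values are all illustrative assumptions, not taken from the bubble data.

```python
import numpy as np

def simulate_hierarchy(n_groups=5, n_obs=20, seed=0):
    """Draw synthetic data from a three-level regression hierarchy (sketch)."""
    rng = np.random.default_rng(seed)
    # Level 3: hypothetical fixed values standing in for draws from a prior.
    mean_ab = np.array([2.0, -0.5])                 # (a, b)
    cov_ab = np.array([[0.25, 0.0], [0.0, 0.01]])   # Sigma
    sigma2 = 0.04                                   # observation variance
    # Level 2: one (a_n, b_n) pair per regression.
    ab = rng.multivariate_normal(mean_ab, cov_ab, size=n_groups)
    # Level 1: noisy straight lines in Z, one row per regression.
    Z = np.linspace(0.0, 5.0, n_obs)
    noise = rng.normal(0.0, np.sqrt(sigma2), size=(n_groups, n_obs))
    log_pop = ab[:, [0]] + ab[:, [1]] * Z + noise
    return Z, ab, log_pop

Z, ab, log_pop = simulate_hierarchy()
```

A fully Bayesian analysis would go in the opposite direction, conditioning on `log_pop` and `Z` to obtain posterior distributions for the quantities at every level.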

5. ON COMPROMISES

As  explained  just  above,  a  complete  hierarchical  model  is  fully  Bayesian,  and  not  a 
compromise.  “Empirical  Bayesian  models”  are  incomplete;  they  forget  the  upper  levels  of 
a  hierarchy  and  treat  the  remaining  parameters  frequentistically.  There  is  no  advantage  to  a 
Bayesian  in  doing  so.  If  the  posterior  distribution  is  peaked  in  the  parameters  taken  to  be 
fixed,  there  may  not  be  too  much  loss  in  this  method  as  an  approximation.  However  in 
great  generality  estimates  of  uncertainty  derived  from  the  “empirical  Bayesian  method” 
will  be  underestimates  of  the  same  measure  derived  from  a  fully  Bayesian  approach, 
because  parameters  are  taken  as  known  with  certainty  that  are  not  known  with  certainty. 

To be successful, a compromise must offer something to each party. Empirical Bayes
methods  do  represent  a  compromise  on  the  frequentist  side,  because  some  (but  not  all) 
parameters  are  treated  as  random  variables  with  distributions.  But  to  a  Bayesian,  this 
“compromise”  offers  no  advantages  over  a  straight  Bayesian  analysis. 

6.  PRAGMATIC  CONCLUSIONS 

In  principle,  I  am  convinced  that  Bayesian  ideas  are  the  right  way  to  structure  thinking 
about  inference.  We  are  still  learning  how  to  use  this  powerful  tool  in  an  effective  way.  If 
the  problem  you  have  can't  be  done  now  in  a  Bayesian  way,  then  you  have  to  work  your 
problem  as  best  you  can,  approximating  a  fully  Bayesian  analysis. 

Even  the  pre-Socratic  philosopher  Heraclitus  understood  that  frequentism  does  not  apply 
to  oceanographic  problems. 

Research supported by NSF SES-8900025 and DMS-9005858, and ONR N00014-89-J-1851.


A BAYESIAN APPROACH TO OBSERVATION QUALITY
CONTROL  IN  VARIATIONAL  AND 
STATISTICAL  ASSIMILATION 


Andrew  C.  Lorenc 

Forecasting Research, Meteorological Office, Bracknell, England

1. INTRODUCTION

Bayesian  methods  are  ideally  suited  to  the  ongoing  operational  data  assimilation  needed 
for  numerical  weather  prediction  (NWP).  Observational  errors  can  be  treated  as  random 
variables,  and  we  have  a  long  experience  of  previous  observations  over  which  to  build  up 
an  estimate  of  their  distribution.  This  experience  tells  us  that  observation  error 
distributions  are  typically  non-Gaussian;  there  are  more  large  errors  than  expected.  It  is 
the  handling  of  these  gross  errors  that  we  call  quality  control.  As  well  as  the  observations, 
we  also  need,  and  have,  much  other  information  about  the  atmosphere.  Indeed  this  prior 
information  is  more  valuable  than  that  from  the  observations  at  any  one  time.  We  have  a 
forecast  “background  field,”  based  on  the  accumulated  knowledge  from  previous 
observations,  which  is  usually  rather  accurate.  A  forecast  based  on  the  background,  with 
no  new  observations,  would  probably  be  more  accurate  than  one  based  on  a  batch  of 
observations,  with  no  background.  So  it  is  essential  to  give  proper  weight  to  this  prior 
knowledge;  the  Bayesian  approach  allows  us  to  do  this. 

In  section  2  we  review  the  Bayesian  derivation  of  the  posterior  probability  of  atmospheric 
states, and hence the equation used to combine observations and background to produce
an “analysis”* for NWP. With Gaussian distributions, the posterior distribution has mean
and variance given by equations which are often derived by a statistical approach, referred
to as optimal interpolation (OI). For NWP we need to find the “best” analysis, without
necessarily evaluating the complete posterior probability density function (p.d.f.). This can
be done by a variational approach, which for Gaussian errors is shown to be equivalent to
OI. With non-Gaussian errors, we have to be more careful in defining “best.” Appropriate
definitions and their interpretation for multi-modal p.d.f.s are discussed.

In section 3 we introduce a simple model of observational errors as the sum of a
no-information distribution of gross errors and a Gaussian distribution of good data. Despite
its simplicity, this distribution has been found to be sufficient to derive an effective quality
control scheme for the majority of observations. The possibility of gross errors leads to a
posterior p.d.f. which may be multi-modal. Variational methods using a descent algorithm
are not guaranteed to find the best analysis.

*  This  terminology  is  traditional  in  NWP.  “Synthesis”  would  be  better. 


The  traditional  approach  to  dealing  with  gross  errors  is  to  apply  a  quality  control 
procedure  to  reject  “bad”  observations,  then  to  perform  the  analysis  with  the  remaining 
observations,  assuming  they  have  Gaussian  errors.  In  section  4  we  provide  a  Bayesian 
justification  of  criteria  for  doing  this.  We  derive  an  expression  for  the  posterior  probability 
of  gross  error  and  reject  a  datum  based  on  this.  (A  similar,  but  not  identical,  probability  is 
implicit  in  variational  descent  algorithms).  The  posterior  probability  can  be  evaluated  for 
gross  errors  in  each  observation — individual  quality  control  (IQC),  or  for  each 
combination  of  gross  errors — simultaneous  quality  control  (SQC).  The  operational  quality 
control  procedure  at  the  Met  Office  is  based  on  IQC,  while  that  at  the  European  Centre 
for Medium-Range Weather Forecasting (ECMWF) is based on SQC. The approaches
differ  subtly  in  the  assumptions  made  about  the  posterior  p.d.f  when  defining  the  “best” 
analysis.  More  significantly,  they  differ  in  the  further  approximations  which  have  to  be 
made  in  a  practical  implementation.  In  section  5,  a  simple  example  is  studied  illustrating 
the  differences  between  the  variational  method,  IQC,  and  SQC. 

2. BAYESIAN DERIVATION OF ANALYSIS EQUATION

This  derivation  mainly  follows  Lorenc  (1986). 

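2.1 Notation

$\mathbf{x}$  atmosphere as represented in model
$\mathbf{x}_t$  model representation of the true state of the atmosphere
$\mathbf{x}_b$  prior estimate of $\mathbf{x}_t$ (e.g., from forecast)
$\mathbf{y}$  observations
$\mathbf{y}_t$  observations that would be given by error-free instruments
$K(\mathbf{x})$  forward operator for calculating $\mathbf{y}$ from $\mathbf{x}$
$\mathbf{K}$  tangent linear operator of $K$, such that $K(\mathbf{x}+\delta\mathbf{x}) = K(\mathbf{x}) + \mathbf{K}\,\delta\mathbf{x} + O(\delta\mathbf{x}^2)$
$P$  probability
$p$  probability distribution function

$P(\mathbf{x})$ = probability that $\mathbf{x} \le \mathbf{x}_t < \mathbf{x}+d\mathbf{x}$
= $p(\mathbf{x})\,d\mathbf{x}$

N.B. We use $\mathbf{x}$ both for the vector of values and for the event $\mathbf{x} \le \mathbf{x}_t < \mathbf{x}+d\mathbf{x}$.

$P(A \mid B)$ is the conditional probability of A, given B.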

2.2  Probability  equations 

Probabilities  are  used  in  a  Bayesian  way  to  describe  the  state  of  information.  We  have 
some  prior  information  about  x.  We  add  to  this  information  from  observations  y.  We  need 
to  know  the  posterior  knowledge  about  x.  Operator  K  does  not  have  a  normal  inverse. 
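
From now on all probabilities are conditional on knowing $\mathbf{x}_b$. To simplify notation we
write $P(\cdot)$ instead of $P(\cdot \mid \mathbf{x}_b)$.

The basis of the derivation is the identity

$$\begin{aligned}
P(\mathbf{x} \cap \mathbf{y}) &= P(\mathbf{x} \mid \mathbf{y})\,P(\mathbf{y}) = P(\mathbf{y} \mid \mathbf{x})\,P(\mathbf{x}) \\
&= p(\mathbf{x} \mid \mathbf{y})\,d\mathbf{x}\;p(\mathbf{y})\,d\mathbf{y} = p(\mathbf{y} \mid \mathbf{x})\,d\mathbf{y}\;p(\mathbf{x})\,d\mathbf{x}.
\end{aligned} \tag{1}$$

What we want an expression for is

$P(\mathbf{x} \mid \mathbf{y}) = p(\mathbf{x} \mid \mathbf{y})\,d\mathbf{x}$, the analysis probability, i.e., the probability that
$\mathbf{x} \le \mathbf{x}_t < \mathbf{x}+d\mathbf{x}$, given the background $\mathbf{x}_b$ and the observations $\mathbf{y}$.

We assume we know certain distributions, based on our prior experience and our
knowledge of the physics:

$P(\mathbf{x}) = p(\mathbf{x})\,d\mathbf{x}$ is the probability that $\mathbf{x} \le \mathbf{x}_t < \mathbf{x}+d\mathbf{x}$, given only the prior
knowledge of $\mathbf{x}_b$.

$p(\mathbf{y} \mid \mathbf{y}_t \cap \mathbf{x})$ is the instrumental error distribution.
$p(\mathbf{y}_t \mid \mathbf{x})$ is the forward operator error distribution.

From the last two distributions, we can find

$P(\mathbf{y} \mid \mathbf{x}) = p(\mathbf{y} \mid \mathbf{x})\,d\mathbf{y}$, the probability of getting observations $\mathbf{y}$ given $\mathbf{x} = \mathbf{x}_t$:

$$p(\mathbf{y} \mid \mathbf{x}) = \int p(\mathbf{y} \mid \mathbf{y}_t \cap \mathbf{x})\,p(\mathbf{y}_t \mid \mathbf{x})\,d\mathbf{y}_t. \tag{2}$$

From this, and our prior knowledge of $\mathbf{x}$, we can find

$P(\mathbf{y}) = p(\mathbf{y})\,d\mathbf{y}$, the probability of getting observations $\mathbf{y}$:

$$p(\mathbf{y}) = \int p(\mathbf{y} \mid \mathbf{x})\,p(\mathbf{x})\,d\mathbf{x}
= \iint p(\mathbf{y} \mid \mathbf{y}_t \cap \mathbf{x})\,p(\mathbf{y}_t \mid \mathbf{x})\,d\mathbf{y}_t\;p(\mathbf{x})\,d\mathbf{x}. \tag{3}$$

Bayes’ Theorem, which follows from the basic identity (1), is

$$p(\mathbf{x} \mid \mathbf{y}) = p(\mathbf{y} \mid \mathbf{x})\,p(\mathbf{x}) / p(\mathbf{y}). \tag{4}$$

We can substitute the expressions derived above to give

$$p(\mathbf{x} \mid \mathbf{y}) = \frac{\int p(\mathbf{y} \mid \mathbf{y}_t \cap \mathbf{x})\,p(\mathbf{y}_t \mid \mathbf{x})\,d\mathbf{y}_t\;p(\mathbf{x})}{\iint p(\mathbf{y} \mid \mathbf{y}_t \cap \mathbf{x})\,p(\mathbf{y}_t \mid \mathbf{x})\,d\mathbf{y}_t\;p(\mathbf{x})\,d\mathbf{x}}. \tag{5}$$

This p.d.f. describes our total posterior information about $\mathbf{x}$, given $\mathbf{x}_b$ and $\mathbf{y}$.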

2.3  Solution  using  Gaussian  probability  distributions 
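We assume $K$ can be linearized in the region of $\mathbf{x}_b$ and $\mathbf{x}_t$ such that

$$K(\mathbf{x}_t) = K(\mathbf{x}_b) + \mathbf{K}(\mathbf{x}_t - \mathbf{x}_b). \tag{6}$$

We assume all the p.d.f.s are Gaussian, and use the notation

$$N(\mathbf{x} \mid \mathbf{m}, \mathbf{B}) = \left((2\pi)^N |\mathbf{B}|\right)^{-1/2} \exp\left(-\tfrac{1}{2}(\mathbf{x}-\mathbf{m})^T \mathbf{B}^{-1} (\mathbf{x}-\mathbf{m})\right) \tag{7}$$

where $\mathbf{B}$ is an $N \times N$ positive definite matrix, and $|\mathbf{B}|$ is its determinant.

We assume that we know

the background error distribution $p(\mathbf{x}) = N(\mathbf{x} \mid \mathbf{x}_b, \mathbf{B})$,

the instrumental error distribution $p(\mathbf{y} \mid \mathbf{y}_t \cap \mathbf{x}) = N(\mathbf{y} \mid \mathbf{y}_t, \mathbf{O})$,

the forward operator error distribution $p(\mathbf{y}_t \mid \mathbf{x}) = N(\mathbf{y}_t \mid K(\mathbf{x}), \mathbf{F})$,

where $\mathbf{B}$, $\mathbf{O}$, and $\mathbf{F}$ are covariances.

Then, using the properties of Gaussians, the observational error distribution is given by the
convolution

$$p(\mathbf{y} \mid \mathbf{x}) = \int p(\mathbf{y} \mid \mathbf{y}_t \cap \mathbf{x})\,p(\mathbf{y}_t \mid \mathbf{x})\,d\mathbf{y}_t = N(\mathbf{y} \mid K(\mathbf{x}), \mathbf{O}+\mathbf{F}) \tag{8}$$

where $\mathbf{O}+\mathbf{F}$ $(=\mathbf{E})$ is the observational error covariance.

The observation distribution, only knowing $\mathbf{x}_b$, is given by

$$p(\mathbf{y}) = N(\mathbf{y} \mid K(\mathbf{x}_b), \mathbf{O}+\mathbf{F}+\mathbf{K}\mathbf{B}\mathbf{K}^T). \tag{9}$$

Substituting these into Bayes’ Theorem (4) gives

$$p(\mathbf{x} \mid \mathbf{y}) = \frac{N(\mathbf{y} \mid K(\mathbf{x}), \mathbf{O}+\mathbf{F})\;N(\mathbf{x} \mid \mathbf{x}_b, \mathbf{B})}{N(\mathbf{y} \mid K(\mathbf{x}_b), \mathbf{O}+\mathbf{F}+\mathbf{K}\mathbf{B}\mathbf{K}^T)} = N(\mathbf{x} \mid \mathbf{x}_a, \mathbf{A}) \tag{10}$$

where $\mathbf{x}_a$ and $\mathbf{A}$ are defined by

$$\begin{aligned}
\mathbf{A} &= \mathbf{B} - \mathbf{B}\mathbf{K}^T(\mathbf{K}\mathbf{B}\mathbf{K}^T+\mathbf{O}+\mathbf{F})^{-1}\mathbf{K}\mathbf{B} \\
\mathbf{x}_a &= \mathbf{x}_b + \mathbf{B}\mathbf{K}^T(\mathbf{K}\mathbf{B}\mathbf{K}^T+\mathbf{O}+\mathbf{F})^{-1}(\mathbf{y} - K(\mathbf{x}_b)).
\end{aligned} \tag{11}$$

It is normal to assume that the “best” estimate of $\mathbf{x}_t$ is given by the mean $\mathbf{x}_a$ of the
Gaussian posterior distribution. Thus using the above equation we can calculate $\mathbf{x}_a$
directly. Equation (11) is the “OI” equation, often derived as the minimum variance best
estimate, without relating it to the p.d.f. (10).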
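
As an illustration, the analysis mean and covariance can be evaluated directly when the forward operator is linear. The following Python sketch assumes NumPy, a forward operator supplied as a matrix K, and a single matrix E standing for O + F; the function name `oi_analysis` and the numerical values are hypothetical and chosen only for illustration.

```python
import numpy as np

def oi_analysis(x_b, B, y, K, E):
    """Gaussian (OI) analysis for a linear forward operator K (illustrative sketch).

    E stands for the observational error covariance O + F.
    """
    x_b, B, y, K, E = (np.asarray(a, dtype=float) for a in (x_b, B, y, K, E))
    S = K @ B @ K.T + E                  # covariance of y - K x_b
    W = np.linalg.solve(S, K @ B).T      # weights B K^T S^-1 (B and S are symmetric)
    x_a = x_b + W @ (y - K @ x_b)        # posterior mean
    A = B - W @ K @ B                    # posterior covariance
    return x_a, A

# Hypothetical scalar example: one observation of one variable.
x_a, A = oi_analysis(x_b=[0.0], B=[[2.25]], y=[-6.0], K=[[1.0]], E=[[1.0]])
```

In this scalar case the analysis is the background plus the fraction B/(B+E) of the observation increment, and the analysis variance is smaller than both B and E.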

2.4  Non-Gaussian  Bayesian  analysis 

If $K$ is more nonlinear, or the p.d.f.s are non-Gaussian, then (10) and (11), capable of
direct solution, cannot be used. Although Bayes’ Theorem (4) for the analysis p.d.f. is
still valid, the expression for $p$ which results is usually too complicated to be very useful in
describing our knowledge about $\mathbf{x}$; we want an estimate of the “best” $\mathbf{x}$, without
evaluating the full p.d.f. First, to define “best,” we define a loss function $L(\mathbf{x}_e, \mathbf{x})$ giving the
cost to us of making an estimate $\mathbf{x}_e$ when the true value is $\mathbf{x}$. The expected loss $R$ is

$$R(\mathbf{x}_e) = \int L(\mathbf{x}_e, \mathbf{x})\,p(\mathbf{x} \mid \mathbf{y})\,d\mathbf{x}. \tag{12}$$

The best estimate is the $\mathbf{x}_e$ which minimizes this. In general this requires evaluating all of
$p(\mathbf{x} \mid \mathbf{y})$. This can be avoided by making $L$ a negative delta function, so that there is a gain
from getting exactly the correct $\mathbf{x}$, while all other values are equally worthless. With this
spike loss

$$L(\mathbf{x}_e, \mathbf{x}) = -\delta(\mathbf{x}_e - \mathbf{x}) \tag{13}$$

$$R(\mathbf{x}) = -p(\mathbf{x} \mid \mathbf{y}). \tag{14}$$

Substituting in the Bayesian expression for $p(\mathbf{x} \mid \mathbf{y})$, and since $p(\mathbf{y})$ is independent of $\mathbf{x}$, the
$\mathbf{x}$ that minimizes $R(\mathbf{x})$ is the same as the $\mathbf{x}$ that minimizes a penalty functional $J$ given by

$$J = -\ln(p(\mathbf{y} \mid \mathbf{x})) - \ln(p(\mathbf{x})). \tag{15}$$
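
If we substitute the Gaussian p.d.f.s of the last section into this, we get

$$J = \tfrac{1}{2}(\mathbf{y} - K(\mathbf{x}))^T (\mathbf{O}+\mathbf{F})^{-1} (\mathbf{y} - K(\mathbf{x})) + \tfrac{1}{2}(\mathbf{x}_b - \mathbf{x})^T \mathbf{B}^{-1} (\mathbf{x}_b - \mathbf{x}) + \text{constant}. \tag{16}$$

If, furthermore, we make $K$ linearizable, we see why the linear problem with Gaussians is
easier to solve; $J$ becomes a quadratic in $\mathbf{x}$. Using the same algebraic manipulations as are
needed to establish the properties of Gaussians used in the last section, and the same
definitions (11) of $\mathbf{x}_a$ and $\mathbf{A}$, gives

$$J = \tfrac{1}{2}(\mathbf{x}_a - \mathbf{x})^T \mathbf{A}^{-1} (\mathbf{x}_a - \mathbf{x}) + \text{constant}. \tag{17}$$

For large problems it is easier to find $\mathbf{x}_a$ iteratively, even if $J$ is quadratic. If $K$ cannot be
linearized over the whole range containing $\mathbf{x}_b$ and possible $\mathbf{x}_a$s, then an explicit solution is
not possible. If $K$ is still differentiable, so that

$$K(\mathbf{x} + \delta\mathbf{x}) = K(\mathbf{x}) + \mathbf{K}_x\,\delta\mathbf{x}, \quad \text{as } \delta\mathbf{x} \to 0 \tag{18}$$

then we can look for the minimum of $J$ using a descent algorithm. At the minimum, the
gradient of $J$ with respect to the components of $\mathbf{x}$ is zero:

$$J' = -\mathbf{K}_x^T (\mathbf{O}+\mathbf{F})^{-1} (\mathbf{y} - K(\mathbf{x})) - \mathbf{B}^{-1} (\mathbf{x}_b - \mathbf{x}) = 0. \tag{19}$$

This formula is exact; we can find the most probable $\mathbf{x}$. The next stage of generalization is
to allow the p.d.f.s to be weakly non-Gaussian. That is, we use the Gaussian formulae
with $\mathbf{O}_x$, $\mathbf{F}_x$, and $\mathbf{B}_x$ being slowly varying functions of $\mathbf{x}$, whose derivatives we can neglect.
We also neglect derivatives of $\mathbf{K}_x$. Then if we define $\mathbf{x}_a$ as the $\mathbf{x}$ which minimizes $J$, i.e.,

$$J' = -\mathbf{K}_a^T (\mathbf{O}_a+\mathbf{F}_a)^{-1} (\mathbf{y} - K(\mathbf{x}_a)) - \mathbf{B}_a^{-1} (\mathbf{x}_b - \mathbf{x}_a) = 0 \tag{20}$$

then

$$J'' = \mathbf{K}_a^T (\mathbf{O}_a+\mathbf{F}_a)^{-1} \mathbf{K}_a + \mathbf{B}_a^{-1} = \mathbf{A}^{-1}. \tag{21}$$

Then, in the neighbourhood of $\mathbf{x}_a$,

$$p_a(\mathbf{x} \mid \mathbf{y}) \propto N(\mathbf{x} \mid \mathbf{x}_a, J''^{-1}). \tag{22}$$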
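
For the linear Gaussian case, the equivalence between descent on the quadratic penalty and the direct OI solution can be checked numerically. The Python sketch below assumes NumPy and a linear forward operator given as a matrix; the fixed step length (the reciprocal of the largest eigenvalue of the second derivative) is one simple choice made for this illustration, and all numerical values are hypothetical.

```python
import numpy as np

def descend(x_b, B, y, K, E, n_iter=500):
    """Fixed-step descent on the quadratic penalty with linear K (sketch)."""
    B_inv = np.linalg.inv(B)
    E_inv = np.linalg.inv(E)
    hessian = K.T @ E_inv @ K + B_inv                 # second derivative J''
    step = 1.0 / np.linalg.eigvalsh(hessian).max()    # safe step for a quadratic
    x = x_b.copy()
    for _ in range(n_iter):
        # Gradient J' of the penalty at the current estimate.
        gradient = -K.T @ E_inv @ (y - K @ x) - B_inv @ (x_b - x)
        x = x - step * gradient
    return x, np.linalg.inv(hessian)

# Hypothetical example: two model variables, three observations.
x_b = np.array([0.0, 0.0])
B = np.array([[2.25, 1.0], [1.0, 2.25]])
K = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
E = np.eye(3)
y = np.array([1.0, -2.0, 0.5])
x_min, A = descend(x_b, B, y, K, E)

# Direct solution for comparison.
x_direct = x_b + B @ K.T @ np.linalg.solve(K @ B @ K.T + E, y - K @ x_b)
```

The inverse of the second derivative returned by `descend` is the analysis error covariance, which agrees with the direct expression for A.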

If $K$ is sufficiently nonlinear, or the p.d.f.s are sufficiently non-Gaussian, $p(\mathbf{x} \mid \mathbf{y})$ may
have multiple maxima. We have then to consider how to decide which is best. We can
generalize on the spike loss, by allowing the loss function for an estimate $\mathbf{x}_e$ to be a Gaussian:

$$L(\mathbf{x}_e, \mathbf{x}) = -N(\mathbf{x}_e \mid \mathbf{x}, \mathbf{L}). \tag{23}$$

As $\mathbf{L}$ tends to zero this gives us the spike loss. For the Gaussian analysis problem we can
evaluate the convolution explicitly:

$$R(\mathbf{x}_e) = -N(\mathbf{x}_e \mid \mathbf{x}_a, \mathbf{A} + \mathbf{L}). \tag{24}$$

Thus the loss is minimum when $\mathbf{x}_e = \mathbf{x}_a$, as we would expect. We can use this expression to
help us in deciding between peaks in a non-Gaussian posterior p.d.f., by assuming that the
peaks can be approximated by a local Gaussian. We assume the spread of the entire
posterior p.d.f. can be characterized by $\mathbf{S}$ (i.e., $\mathbf{S}$ describes the distance between peaks). If
$\mathbf{L} \gg \mathbf{S}$ then the loss function is quadratic over the range of significant probabilities, and the
best estimate is the mean of the full p.d.f. (which may fall between two peaks). But if
$\mathbf{L} \ll \mathbf{S}$ then we may consider the peaks separately. Then if in the vicinity of the $i$th local
maximum the p.d.f. is

$$p(\mathbf{x} \mid \mathbf{y}) = P_i\,N(\mathbf{x} \mid \mathbf{x}_i, \mathbf{A}_i) \tag{25}$$

then the loss associated with choosing the analysis to be at this maximum of $p(\mathbf{x} \mid \mathbf{y})$ is
given by

$$R(\mathbf{x}_i) = -P_i\,N(\mathbf{x}_i \mid \mathbf{x}_i, \mathbf{A}_i + \mathbf{L}). \tag{26}$$

If $\mathbf{S} \gg \mathbf{A}_i \gg \mathbf{L}$ then $R(\mathbf{x}_i) = -p(\mathbf{x}_i \mid \mathbf{y})$, and the best peak is the highest. If $\mathbf{A}_i \ll \mathbf{L} \ll \mathbf{S}$ then
$R(\mathbf{x}_i) = -P_i \times$ constant (independent of $i$), and the best peak is that with the largest area.

3. NON-GAUSSIAN OBSERVATIONAL ERRORS

3.1 Gross error model

Lorenc and Hammon (1988) introduced a simple model of observational errors: they are
uncorrelated, so that each observation can be considered separately. For each, either the
observation is good, in which case its error comes from a Gaussian, or it has a gross error,
in which case all observed values over a range of plausible values are equally likely. Thus we
have (for “plausible” $y$)

$$p(y \mid x) = (1 - P(G))\,N(y \mid K(x), E) + P(G)\,k \tag{27}$$

where $E$ is the observational error variance $(=O+F)$, $P(G)$ is the probability of gross error,
$x$ is the true value, and $k$ is given by

$$\int_{\text{plausible values}} k\,dy = 1. \tag{28}$$
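
The error model for a single scalar observation is easy to evaluate numerically. The Python sketch below is illustrative only: the function names are hypothetical, and the values P(G) = 0.04 and k = 0.043 are borrowed from the sea-level-pressure example quoted later in the Figure 4 caption. It also evaluates the corresponding observation penalty, minus the logarithm of the likelihood, which flattens out for large deviations.

```python
import math

def gaussian_pdf(y, mean, var):
    """Univariate Gaussian density."""
    return math.exp(-0.5 * (y - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def obs_likelihood(y, kx, E, p_gross, k):
    """Gaussian for good data plus a flat density k for gross errors."""
    return (1.0 - p_gross) * gaussian_pdf(y, kx, E) + p_gross * k

def obs_penalty(y, kx, E, p_gross, k):
    """Observation term of the penalty function: -ln p(y|x)."""
    return -math.log(obs_likelihood(y, kx, E, p_gross, k))

# Hypothetical values: unit error variance, deviations of 0, 10 and 20.
penalties = [obs_penalty(d, 0.0, 1.0, 0.04, 0.043) for d in (0.0, 10.0, 20.0)]
```

For the two large deviations the penalties are essentially equal, so the gradient there carries almost no information about the direction of the minimum.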

3.2 Posterior p.d.f. with gross errors

It is instructive to look at some simple posterior p.d.f.s resulting from this model, before
going on to the full multivariate analysis problem. The simplest case is for a single
observation of one parameter, and a prior (background) estimate $y_b$ $(=K(x_b))$ from a
Gaussian distribution. Because $p(y \mid x)$ is non-Gaussian, the shape of the posterior p.d.f.
depends on the difference between $y$ and $y_b$, as illustrated in Figure 1. Even in this, the
simplest case, there are multiple maxima, and there are configurations in which a
variational search, starting from the prior estimate $y_b$, will not find the best value.

Figure 1. The p.d.f.s for an observation, background, and Bayesian analysis for a selection of
observation-background differences. The p.d.f.s are appropriate for ship observations of surface
pressure, with P(G)=0.05 (Lorenc and Hammon 1988).

Figure 2 shows a similar error model applied to two realizations, each of ten observations,
from an idealized Doppler observing system. With poor signal to noise ratio, P(G) may be
large for such an instrument; we have used P(G)=0.5. In the lower example, it is not clear
which is the “best” estimate; no method can consistently find it.

Figure 2. Two examples of p.d.f.s from simulated Doppler wind observations, with $x_t$=7, good
observations having E=9 and P(G)=0.5 (Dharssi et al. 1992).

3.3  Variational  descent  algorithms  in  the  presence  of  gross  errors 

Even in the top example of Figure 2 there are multiple maxima, which become more
obvious minima if we convert to a $-\ln(p)$ penalty function $J$, so a descent algorithm must
start near the correct value, if it is to find the absolute minimum.

Lorenc (1988) used an observational error distribution like (27) in a variational analysis
based on minimizing (15). The possibility of gross errors converts the quadratic penalty
function of (16) into one with plateaus (Figure 3). If the current estimate in an iterative
algorithm is on one of these, the gradient does not well indicate which way to adjust
towards the minimum. Note that the width of the minimum depends on E, while the spread
of the deviations between initial estimate and observations depends on B+E. So if B is
large, the iteration may not move towards the absolute minimum. This was the case in the
experiments of Lorenc (1988). He tried various methods to improve the first guess of the
iteration, for instance by first setting P(G)=0, but with limited success.

Dharssi et al. (1992) had more success in their examples. In simple single value problems
like those shown in Figure 2, they found that increasing the observational error E in early
iterations helped the iterative estimate move towards the best value. In a two-dimensional
simulation of winds from a scanning lidar, they found that for relatively dense but
unreliable (P(G)=0.5) observations, the iteration did converge. It is an open question
whether descent algorithms, suitably modified in early iterations, will be sufficient for
practical applications, or whether we will still need the decision algorithms described in the
next section.

Figure 3. Penalty function against normalized deviation for a single observation. Solid line:
quadratic penalty function; dashed line: penalty function assuming P(G)=0.05 (Lorenc 1988).


4. QUALITY CONTROL

4.1 Posterior probability of gross error

The posterior p.d.f.s shown in Figure 1 are each the sum of two Gaussians, one
corresponding to there being a gross error ($G$), one corresponding to the observation
being correct ($\bar{G}$). Lorenc and Hammon (1988) proposed applying Bayes’ theorem
directly to the gross error event $G$:

$$P(G \mid y) = \frac{p(y \mid G)\,P(G)}{p(y)} = \frac{p(y \mid G)\,P(G)}{p(y \mid G)\,P(G) + p(y \mid \bar{G})\,P(\bar{G})}. \tag{29}$$

Using (2), (27), and (28), we have

$$p(y \mid G) = k. \tag{30}$$

Using (2), (27), and (9), we have

$$p(y \mid \bar{G}) = N(y \mid K(x_b), E + \mathbf{K}\mathbf{B}\mathbf{K}^T) \tag{31}$$

so (29) can be readily evaluated. The two Gaussians in Figure 1 are weighted by $P(G \mid y)$
and $P(\bar{G} \mid y)$ respectively. Thus rejecting or accepting an observation depending on
whether $P(G \mid y)$ is greater than or less than 0.5 is consistent with the “best” analysis in
terms of a Gaussian loss function, as discussed in relation to (26), as long as $S \gg L \gg A$.
This is the basis of the decision taking algorithms used in Bayesian quality control
schemes.
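
For a single scalar observation with K = 1, the posterior probability of gross error reduces to a few lines of arithmetic. The Python sketch below is illustrative: the function name is hypothetical, and the parameter values (E = 1, B = 2.25, k = 0.043, P(G) = 0.04) are those quoted later in the Figure 4 caption for sea-level-pressure observations.

```python
import math

def prob_gross_error(y, y_b, E, B, p_gross, k):
    """Posterior probability of gross error for one scalar observation (K = 1)."""
    var = E + B                                   # variance of y - y_b for good data
    p_y_good = math.exp(-0.5 * (y - y_b) ** 2 / var) / math.sqrt(2.0 * math.pi * var)
    p_y_gross = k                                 # flat density for a gross error
    numerator = p_y_gross * p_gross
    return numerator / (numerator + p_y_good * (1.0 - p_gross))

# Observation-minus-background differences of 0, -6 and -9.
p0, p6, p9 = (prob_gross_error(y, 0.0, 1.0, 2.25, 0.04, 0.043) for y in (0.0, -6.0, -9.0))
```

With these values an observation agreeing with the background is almost certainly good, one differing by 9 is almost certainly a gross error, and one differing by 6 lies on the rejection side of the 0.5 threshold when considered on its own.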

Dharssi et al. (1992) pointed out an interesting relationship between the variational
method and the posterior probability of gross error. If we calculate $J'$ using the error
model (27), then we get

$$J' = -\mathbf{K}_x^T (\mathbf{E}_x)^{-1} (\mathbf{y} - K(\mathbf{x})) - \mathbf{B}^{-1} (\mathbf{x}_b - \mathbf{x}) = 0 \tag{32}$$

where the diagonal element of $\mathbf{E}_x$ for observation $i$ is given by

$$(\mathbf{E}_x)_i = E_i / P(\bar{G}_i \mid \mathbf{x} \cap y_i). \tag{33}$$

Here $E_i$ is the observational error variance of observation $i$ if it does not have a gross error,
and $P(\bar{G}_i \mid \mathbf{x} \cap y_i)$ is the posterior probability that it does not have a gross error, given that
$\mathbf{x} = \mathbf{x}_t$. We are effectively increasing the assumed error variance of observations that are
unlikely to be correct. (This is not the same as the artificial increase discussed in section
3.3, where $E_i$ is increased when calculating $E_i / P(\bar{G}_i \mid \mathbf{x} \cap y_i)$ in early iterations, to aid
convergence towards the global minimum.) Equation (32) has the same form as (19), so
by using (33) each iteration, a variational method for Gaussian errors is converted to one
for non-Gaussian errors.

At convergence, there will exist a final estimate of $P(\bar{G}_i \mid \mathbf{x} \cap y_i)$ for each observation. It
can be considered to be a variational quality control (VQC) decision about the
observations’ quality.
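
The reweighting idea can be sketched for a scalar state observed directly (K = 1). In the Python fragment below, each pass computes the probability that each observation is good given the current estimate, inflates its error variance accordingly, and re-solves the Gaussian analysis. This fixed-point iteration is an assumption made for illustration rather than the specific descent scheme of any operational system; the function name is hypothetical, and the data and parameters are those quoted later in the Figure 4 caption.

```python
import math

def vqc_analysis(x_start, x_b, B, ys, E, p_gross, k, n_iter=50):
    """Iteratively reweighted scalar analysis with inflated error variances (sketch)."""
    x = x_start
    weights = []
    for _ in range(n_iter):
        weights = []
        for y in ys:
            good = (1.0 - p_gross) * math.exp(-0.5 * (y - x) ** 2 / E) / math.sqrt(2.0 * math.pi * E)
            weights.append(good / (good + p_gross * k))   # P(no gross error | x, y_i)
        # Gaussian analysis with each E_i replaced by E / weight_i.
        numerator = x_b / B + sum(w * y / E for w, y in zip(weights, ys))
        denominator = 1.0 / B + sum(w / E for w in weights)
        x = numerator / denominator
    return x, weights

ys = [-9.0, -9.0, -6.0]
x_from_background, w_background = vqc_analysis(0.0, 0.0, 2.25, ys, 1.0, 0.04, 0.043)
x_from_near_obs, w_near_obs = vqc_analysis(-8.0, 0.0, 2.25, ys, 1.0, 0.04, 0.043)
```

Started from the background value, the iteration stays close to the background because every observation receives a negligible weight; started near the observations, it converges to an estimate close to -7 with all three weights above 0.5. This illustrates the sensitivity to the first guess.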

4.2  Individual  quality  control  (IQC) 

Equation (29) can be extended to consider more than one observation. Lorenc and
Hammon (1988) give the derivation for two observations:

$$P(G_1 \mid \mathbf{y}) = P(G_1 \mid y_1) \Big/ \frac{p(\mathbf{y})}{p(y_1)\,p(y_2)} \tag{34}$$

$$\frac{p(\mathbf{y})}{p(y_1)\,p(y_2)} = 1 - P(\bar{G}_1 \mid y_1)\,P(\bar{G}_2 \mid y_2) \left(1 - \frac{p(\mathbf{y} \mid \bar{G}_1 \cap \bar{G}_2)}{p(y_1 \mid \bar{G}_1)\,p(y_2 \mid \bar{G}_2)}\right). \tag{35}$$

Ingleby and Lorenc (1992) give a more general derivation. The number of terms to be
considered in the extended equation goes as $2^n$, where $n$ is the number of observations, so
evaluation of the exact equation rapidly becomes impractical. Lorenc and Hammon (1988)
suggest sequential application of the “buddy check” equation for two observations as an
approximation. This is the method used operationally at the Met Office. The decision
about whether to use each observation $i$ is made individually, based on an approximation
to its posterior probability of gross error $P(G_i \mid \mathbf{y})$. The analysis is then made using the
accepted observations, assuming they have Gaussian errors.

4.3 Simultaneous quality control

The $2^n$ terms in the full expression for $P(G_i \mid \mathbf{y})$ come from the various combinations of
accepted and rejected observations. Each combination $C_\alpha$ is associated with a multivariate
normal distribution, each individually calculated using (10), so that the total p.d.f. is given
by (Ingleby and Lorenc 1992)

$$p(\mathbf{x} \mid \mathbf{y}) = \sum_{\alpha=0}^{2^n-1} P(C_\alpha \mid \mathbf{y})\,p(\mathbf{x} \mid \mathbf{y} \cap C_\alpha). \tag{36}$$

The posterior probability for each combination of gross errors can be found using Bayes’
theorem:

$$P(C_\alpha \mid \mathbf{y}) = p(\mathbf{y} \mid C_\alpha)\,P(C_\alpha) / p(\mathbf{y}). \tag{37}$$

If we assume that each of the Gaussians which makes a significant contribution to (36) has
a distinct peak, then we can apply (26) to decide which gives the best estimate of $\mathbf{x}$. If
$S \gg L \gg A$ it is the one with the maximum $P(C_\alpha \mid \mathbf{y})$.

Evaluating all $2^n$ probabilities is impossible for large $n$. Since $p(\mathbf{y})$ is the same for each
we can instead search for the combination with the maximum $p(\mathbf{y} \mid C_\alpha)\,P(C_\alpha)$. The states
$C_\alpha$ correspond to the vertices of an $n$-dimensional hypercube. One possible algorithm for
searching only a small subset of possible combinations is related to the SIMPLEX
algorithm in integer linear programming. We start with an estimate of the best, and then
search to see if any of its neighbours is more likely. Moving from one $C_\alpha$ to a neighbour
corresponds to changing the quality control decision on one observation, while keeping
those on other observations the same. If one of the neighbouring combinations is more
likely, we can then search its neighbours, and so on. This is the basis of the OI quality
control algorithm of Lorenc (1981), which is used at ECMWF. Rather like the variational
descent algorithms, this search algorithm relies on having a good first guess of the best $C_\alpha$,
since there will in general be multiple local maxima.
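
The neighbour search can be sketched in a few lines for a scalar state observed directly (K = 1) with independent gross errors. The Python fragment below is a hedged illustration, not the operational ECMWF code: the function names are hypothetical, and the data and parameters are those quoted in the Figure 4 caption. It compares the search from two different first guesses with an exhaustive evaluation of all combinations.

```python
import itertools
import numpy as np

def combo_weight(accept, ys, x_b, E, B, p_gross, k):
    """p(y|C) P(C) for one accept/reject combination (scalar x, K = 1)."""
    accept = np.asarray(accept, dtype=bool)
    n_acc = int(accept.sum())
    n_rej = accept.size - n_acc
    weight = (p_gross * k) ** n_rej * (1.0 - p_gross) ** n_acc
    if n_acc:
        d = np.asarray(ys, dtype=float)[accept] - x_b
        cov = E * np.eye(n_acc) + B * np.ones((n_acc, n_acc))   # E + K B K^T
        quad = d @ np.linalg.solve(cov, d)
        weight *= np.exp(-0.5 * quad) / np.sqrt((2.0 * np.pi) ** n_acc * np.linalg.det(cov))
    return weight

def neighbour_search(start, score):
    """Flip one decision at a time while that increases the score."""
    current = tuple(start)
    while True:
        neighbours = [current[:i] + (not current[i],) + current[i + 1:]
                      for i in range(len(current))]
        best = max(neighbours, key=score)
        if score(best) <= score(current):
            return current
        current = best

ys = [-9.0, -9.0, -6.0]
score = lambda c: combo_weight(c, ys, 0.0, 1.0, 2.25, 0.04, 0.043)
exhaustive = max(itertools.product([False, True], repeat=len(ys)), key=score)
from_accept_all = neighbour_search((True, True, True), score)
from_reject_all = neighbour_search((False, False, False), score)
```

With these values the exhaustive maximum is the combination rejecting all three observations, whereas the search started from accepting everything stops immediately at that first guess, which is only a local maximum.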

5.  COMPARISON  OF  QUALITY  CONTROL  CRITERIA 

Figure  4  shows  an  example  chosen  to  illustrate  the  differences  between  the  approaches. 
The solid line shows the posterior p.d.f. given by (36), while the dotted lines are the
constituent Gaussians. Variational analysis, using a spike loss function, will pick the
highest peak (VAN). Note however that a simple descent algorithm would have to start
quite close to $x_{VAN}$ to converge to the correct value; starting from $x_b$ will not do.

Assuming this $x_{VAN}$ is correct, all the observations have $P(\bar{G}_i \mid x_{VAN} \cap y_i) > 0.5$, so if we
were to use this as an acceptance criterion, and do a Gaussian analysis using the
observations, we would get the value corresponding to the peak VQC.

Calculating the $P(\bar{G}_i \mid \mathbf{y})$ for each observation (IQC), the two observations of -9 both have
posterior probabilities less than 0.5 (i.e., they fail) while the observation of -6 just passes.
This  pass  is  in  part  due  to  contributions  from  the  possibility  that  the  other  observations 
were  actually  correct;  IQC  can  give  inconsistent  decisions. 

Simultaneous  quality  control  does  look  for  a  consistent  decision;  in  this  case  the  Gaussian 
with  the  largest  area  is  that  labelled  SQC.  It  corresponds  to  rejection  of  all  the 
observations,  i.e.,  it  is  the  background  distribution.  Note  that  the  SIMPLEX  algorithm  will 
not  work  well  in  this  case.  There  is  one  local  maximum  for  the  combinations  accepting 
both  observations  of  -9,  and  another  for  the  combinations  rejecting  them  both.  The 
SIMPLEX  algorithm  will  converge  to  one  of  these;  it  cannot  get  from  one  to  the  other 
because  intermediate  combinations  (accepting  one  and  rejecting  the  other)  are  less  likely. 
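
For only three observations the exact probabilities can be enumerated, which makes the contrast between the individual and simultaneous decisions explicit. The Python sketch below assumes a scalar state observed directly (K = 1), independent gross errors, and the parameter values quoted in the Figure 4 caption; the function and variable names are hypothetical.

```python
import itertools
import math

def combo_weight(accept, ys, x_b, E, B, p_gross, k):
    """p(y|C) P(C) for one combination, using the closed form for scalar x, K = 1."""
    d = [y - x_b for y, a in zip(ys, accept) if a]
    n = len(d)
    weight = (p_gross * k) ** (len(ys) - n) * (1.0 - p_gross) ** n
    if n:
        # Covariance E*I + B*ones has determinant E^(n-1) (E + nB).
        quad = (sum(v * v for v in d) - B / (E + n * B) * sum(d) ** 2) / E
        det = E ** (n - 1) * (E + n * B)
        weight *= math.exp(-0.5 * quad) / math.sqrt((2.0 * math.pi) ** n * det)
    return weight

ys = [-9.0, -9.0, -6.0]
combos = list(itertools.product([False, True], repeat=len(ys)))
weights = [combo_weight(c, ys, 0.0, 1.0, 2.25, 0.04, 0.043) for c in combos]
total = sum(weights)
posterior = {c: w / total for c, w in zip(combos, weights)}

# IQC: marginal probability that each observation is good.
p_good = [sum(p for c, p in posterior.items() if c[i]) for i in range(len(ys))]
# SQC: the single most probable combination of decisions.
sqc_choice = max(posterior, key=posterior.get)
```

Under these assumptions the marginal probabilities of being good fall below 0.5 for the two observations of -9 and slightly above 0.5 for the observation of -6, while the most probable single combination rejects all three observations.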

* The ECMWF scheme sets rejection tolerances directly, but an equivalent formulation similar to (29) is
possible.

Figure 4. Solid curve: $P(x \mid \mathbf{y})$; dotted curves: $P(x \mid \mathbf{y} \cap C_\alpha)$ for $y_i$ = -9, -9 and -6, $x_b$ = 0, and other
values appropriate for sea-level-pressure observations (E=1, B=2.25, k=0.043, and P(G)=0.04), from
Ingleby and Lorenc (1992). For meaning of annotations, see text.


6. CONCLUDING REMARKS

We  have  shown  that  the  Bayesian  approach  provides  a  sound  method  for  combining 
observations  and  background  information.  If  distributions  are  Gaussian,  it  leads  to  the 
statistical interpolation (OI) equations and to a variational analysis with a quadratic
penalty  function.  It  also  indicates  how  the  method  can  be  extended  to  observations  with 
non-Gaussian  distributions. 

The  proper  “best”  analysis  depends  on  an  appropriately  defined  loss  function.  Finding  it 
requires convolutions over the posterior probability density function, which for
non-Gaussian distributions is impractical. Variational analysis (VAN and VQC) and quality
control algorithms (IQC and SQC) make approximations to the ideal loss function.
In NWP, we have a background which usually would lead to a forecast that is not too
bad. Large improvements on this accuracy are not required, so $L \approx B$. Individual peaks in
the p.d.f. have $A < B$. So the assumption that the region of useful analyses is larger than
each peak, but smaller than the distance between peaks ($S \gg L \gg A_i$), may not be too bad
for NWP assimilation.


There also have to be approximations in implementation; none of the methods can be
implemented perfectly in practical NWP problems. In the approximate forms discussed
here:


VAN  and  VQC  use  a  descent  algorithm,  with  a  modified  penalty  function  in  early 
iterations  to  try  to  get  convergence  to  the  best  x  from  as  wide  a  range  as  possible 
of  first-guesses.  This  has  been  tried  on  simulated  data  by  Dharssi  et  al.  (1992)  and 
is  an  attractive  candidate  for  future  variational  NWP  assimilation  systems. 

IQC,  as  used  at  the  Met  Office  (Lorenc  and  Hammon  1988),  uses  a  sequential 
pairwise  buddy  check  to  approximate  the  method  for  >2  close  observations.  Some 
tuning  of  this  has  been  found  to  be  necessary. 

SQC,  with  a  SIMPLEX  search,  does  not  necessarily  correctly  handle  close 
observations  which  agree  with  each  other,  but  might  both  be  wrong.  The  method 
used  at  ECMWF  (Lorenc  1981)  is  similar  to  this  (although  the  rejection 
tolerances  are  set  directly,  rather  than  via  P(G)). 

The  Bayesian  approach  has  allowed  us  to  understand  the  relationship  between  these 
different  methods. 
INTRODUCTION

The  spatial  and  temporal  variability  of  Gulf  Stream  meanders  has  been  studied  by  many 
including  Watts  and  Johns  (1982),  Halliwell  and  Mooers  (1979  and  1983),  Olson  et  al. 
(1983), and Cornillon (1986). The majority of these studies use the northern edge, or north
wall, determined from the largest spatial gradient in advanced very high resolution
radiometer (AVHRR) data, as the Gulf Stream path indicator. The advantages of using
AVHRR  data  for  locating  the  Gulf  Stream  are  (i)  the  large  contemporaneous  spatial 
coverage,  (ii)  the  measurements  have  been  collected  daily  since  1978,  and  (iii)  the  frontal 
locations  are  the  strongest  signal  in  the  data.  The  chief  disadvantages  are  the  amount  of 
processing  (geometric  corrections,  cloud-screening/compositing,  and  manual  digitizing  of 
frontal  positions  from  images)  required  and  that  the  satellite  sensor  cannot  see  through  the 
clouds.  Consequently,  there  are  large  spatial  (2-6  degrees)  and  temporal  (3-6  days)  gaps  in 
the  Gulf  Stream  north  wall  position  (GSNWP)  data  set.  Mariano  (1988  and  1990)  devised 
a  new  approach,  termed  contour  analysis,  for  melding  of  oceanic  data  and  for  space-time 
interpolation  of  gappy  frontal  data  sets.  The  key  elements  of  contour  analysis  are  feature 
matching  and  averaging  in  a  coordinate  system  determined  from  the  contour  positions.  In 
applying his approach to the GSNWP, Mariano assumed a dominant one-dimensional
east-west phase speed in his algorithm. This assumption restricted the application of this
algorithm to other frontal data sets, such as the Brazil-Malvinas confluence (Garzoli et al.,
1992) where the north-south phase speeds are also important, and led to poor estimates of
the GSNWP when the north-south phase speed was significant.

The  primary  goal  of  this  study  is  to  develop  an  improved  algorithm  for  space-time 
interpolation of gappy frontal data sets. The major improvements are (i) a
two-dimensional phase speed, (ii) a more autonomous algorithm, (iii) a better
feature-matching algorithm, and (iv) a temporal smoothness constraint. The
space-time  interpolator  is  formulated  in  the  framework  of  probabilistic  (Bayesian) 
estimation.  This  report  first  reviews  such  an  estimation  theoretic  framework  and,  in 
particular,  a  Kalman  filter-based  interpolation  algorithm.  Then,  feature  detection  and 
matching  algorithms  are  discussed,  followed  by  presentation  and  discussion  of  some 
preliminary  results. 
BACKGROUND

The approach described in this report is a two-step process. First, the locations of the sea
surface temperature (SST) “edges” (gradient maxima) are detected and digitized by
trained personnel at the University of Rhode Island (URI). Then, the longitude-latitude
coordinates of the digitized points are interpolated by an autonomous computer program.

This  report  describes  this  second  step — a  probabilistic  approach  to  the  development  of  a 
space-time  interpolation  algorithm. 

The  space-time  interpolation  problem  is  formulated  as  a  quadratic  optimization  problem. 

Here,  we  review  how  the  cost  function  can  be  optimized  using  a  Bayesian  estimation 
framework  (with  additive  white  Gaussian  noise  models)  and  how  the  solution  can  be 
obtained  time-recursively  using  Kalman  filters. 

1.  Space-only  interpolation 

We first discuss the problem of interpolating points digitized from a single image frame,
as this is the first step of our space-time interpolation algorithm. Let $(x_i, y_i)$, $i = 1, 2, \ldots, m$,
be the longitude-latitude coordinates of the digitized points. We assume, for the time
being, that the latitudes of the GSNWP can be described by a function of the longitudes
$x$ only, i.e., there exists a single-valued function $y(x)$. This is a mathematically convenient
description used in the previous studies of Gulf Stream variability, but it is not always
appropriate for Gulf Stream meanders. The bi-variate formulation for multi-valued
features, such as “S” and “Ω” shaped meanders, is discussed after analyzing the simpler
single-valued case.

The function is interpolated based on the measurements $(x_i, y_i)$ by finding the function
that optimizes

$$\min_y \; \sum_{i=1}^{m} v_i \left[ y(x_i) - y_i \right]^2 + \int_0^L \left[ \alpha_1 \left( \frac{dy}{dx} \right)^2 + \alpha_2 \left( \frac{d^2 y}{dx^2} \right)^2 \right] dx \qquad (1)$$

where $v_i$ are the weights representing our confidence in the corresponding measurements.
The two integral terms, weighted by the parameters $\alpha_1$ and $\alpha_2$, control continuity
(“tension”) and linearity (“smoothness”) of the interpolated curve, respectively. This
optimization approach finds applications in general geophysical interpolation and
variational problems (e.g., Inoue, 1986).
2. Maximum likelihood estimation

To obtain a numerical solution of Eq. (1), the longitude is discretized as
$x = j\Delta x$, $j = 1, 2, \ldots, n$. The interval $\Delta x$ is chosen small enough for the discrete domain to
include (within a reasonable quantization error) the measurements, i.e., $\{x_i\} \subset \{j\Delta x\}$,
which implies $m < n$; the number of points to be estimated is usually three to four times
the number of data points. The corresponding latitudes are represented by an
$n$-dimensional column vector $y$ whose elements are $y(j\Delta x)$, $j \in [1, n]$, while the
measurements of the latitudes are organized as an $m$-dimensional vector $z$ whose elements
are $y_i$, $i \in [1, m]$. A discrete version of Eq. (1) is

$$\min_y \; \alpha_1 \| S_1 y \|^2 + \alpha_2 \| S_2 y \|^2 + \| z - Hy \|_M^2 \qquad (2)$$

where the vector norms are weighted 2-norms, e.g., $\| z - Hy \|_M^2 = (z - Hy)^T M (z - Hy)$.
(The superscript $T$ denotes matrix transpose.) The matrices $S_1$ and $S_2$ are the first and
second order difference operators, respectively, while $M$ is a diagonal matrix whose
diagonal elements are the measurement weights $v_i$, $i \in [1, m]$. The $m \times n$ matrix $H$ is the
data-estimate correspondence operator whose $(i,j)$th element $h_{ij}$ is defined as

$$h_{ij} = \begin{cases} 1 & \text{if } x_i = j\Delta x, \\ 0 & \text{otherwise.} \end{cases} \qquad (3)$$

The process of determining the matrix $H$ (the correspondence problem) is
straightforward in this case, where the latitudes are treated as a function of the longitudes.
Some  GSNWP  features,  such  as  an  “S”  shaped  meander,  can  make  the  correspondence 
problem  quite  complex.  Mariano  (1990)  showed  that  detecting  and  matching  such  features 
based  on  the  sparse  sets  of  data  points  are  the  key  (and  most  difficult)  components  for  a 
successful  interpolation  scheme.  Our  solution  to  the  correspondence  problem  is  presented 
in  the  next  section. 

The minimizing solution $y$ of Eq. (2) is exactly the maximum likelihood estimate $\hat{y}$ based
on the observation equation

$$\begin{bmatrix} z \\ 0 \\ 0 \end{bmatrix} = \begin{bmatrix} H \\ S_1 \\ S_2 \end{bmatrix} y + \begin{bmatrix} v_0 \\ v_1 \\ v_2 \end{bmatrix} \qquad (4)$$

where the additive observation noises $v_0$, $v_1$, and $v_2$ are mutually independent zero-mean
Gaussian random vectors with covariances $M^{-1}$, $\alpha_1^{-1} I$, and $\alpha_2^{-1} I$, respectively. The solution of
this probabilistic estimation problem requires the minimization described in Eq. (2) (Lewis,
1986); thus, the maximum likelihood formulation based on Eq. (4) constitutes a
probabilistic interpretation of Eq. (2). An advantage of this probabilistic version is that the
estimation error covariance can be computed, along with the estimate itself, allowing us to
quantify confidence/uncertainty in the solution. For Eq. (4), the optimal estimate $\hat{y}$ and
estimation error covariance $P$ are given by

$$\hat{y} = L^{-1} H^T M z \qquad (5)$$

$$P = L^{-1} \qquad (6)$$

where $L = H^T M H + \alpha_1 S_1^T S_1 + \alpha_2 S_2^T S_2$ is a sparse penta-diagonal matrix. Alternatively, the
minimization problem Eq. (2) can also be reformulated as a Bayesian estimation problem
in  which  the  first  two  terms  in  Eq.  (2)  are  interpreted  as  the  prior  statistics  for  the 
unknown  y  (Szeliski,  1989).  Both  the  Bayesian  and  maximum  likelihood  formulations  are 
equivalent  when  Gaussian  noise  models  are  used,  as  they  yield  the  same  solution. 

In terms of selecting the parameters for the interpolation problem, the probabilistic
formulation must be specified slightly more precisely than its variational counterpart. In
Eq. (2) the weights $\alpha_1$, $\alpha_2$, and $M$ are only required to be specified up to a multiplicative
constant; only the ratios among the weights need to be controlled. The same parameters
in the probabilistic formulation Eq. (4) determine the noise covariances, whose values
(not just the ratios among them) must be given exactly. This extra bit of precision is
necessary for the computed $P$ to be interpreted meaningfully as the estimation error
covariance.
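
The space-only solution of Eqs. (5) and (6) can be sketched numerically as follows. This is a minimal illustration, not the implementation used in this study: the function name `interpolate_frame`, the 0-based grid indices, and the use of dense NumPy matrices are assumptions made for brevity (a sparse banded solver would normally be preferred for the penta-diagonal $L$).

```python
import numpy as np

def interpolate_frame(idx, z, v, n, alpha1, alpha2):
    """Solve Eqs. (5)-(6): y_hat = L^{-1} H^T M z and P = L^{-1}.

    idx : 0-based grid indices j of the m measured longitudes (hypothetical input format)
    z   : the m measured latitudes
    v   : the m measurement weights (diagonal of M)
    n   : number of grid points to be estimated
    """
    m = len(idx)
    H = np.zeros((m, n))
    H[np.arange(m), idx] = 1.0              # Eq. (3): h_ij = 1 where x_i = j*dx
    M = np.diag(v)
    S1 = np.diff(np.eye(n), n=1, axis=0)    # first-order difference operator
    S2 = np.diff(np.eye(n), n=2, axis=0)    # second-order difference operator
    L = H.T @ M @ H + alpha1 * S1.T @ S1 + alpha2 * S2.T @ S2
    P = np.linalg.inv(L)                    # estimation error covariance, Eq. (6)
    y_hat = P @ (H.T @ M @ z)               # estimate, Eq. (5)
    return y_hat, P, L

# Illustrative data only: four measurements on a 20-point grid.
idx = np.array([0, 5, 10, 19])
z = np.array([0.0, 1.0, 0.0, -1.0])
y_hat, P, L = interpolate_frame(idx, z, np.ones(4), n=20, alpha1=0.01, alpha2=1.0)
```

The square roots of the diagonal of `P` give the standard deviations of the estimated latitudes, which are meaningful only if the weights are specified as actual inverse noise covariances, as discussed above.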


3.  Time-extension  and  Kalman  filtering 

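Equation (1) can be extended temporally to perform space-time interpolation for $y(x,t)$
using an additional continuity constraint over time:

$$\min_y \; \sum_{k=1}^{K} \sum_{i=1}^{m(k)} v_i(k) \left[ y(x_i(k), k\Delta t) - y_i(k) \right]^2 + \int\!\!\int_0^L \left[ \alpha_1 \left( \frac{\partial y}{\partial x} \right)^2 + \alpha_2 \left( \frac{\partial^2 y}{\partial x^2} \right)^2 + \beta_1 \left( \frac{\partial y}{\partial t} \right)^2 \right] dx\, dt \qquad (7)$$

where the time variable is discretized as $t = k\Delta t$, $k = 1, 2, \ldots, K$, and the variables associated
with the measurements are indexed by $k$. In the GSNWP estimation problem, $\Delta t$ is two
days. The parameter $\beta_1$ controls the strength of the temporal constraint.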
A discrete and probabilistic interpretation of Eq. (7) can be obtained by supplementing Eq.
(2) with an evolution equation (8) representing the time-continuity constraint. The result is
a stochastic dynamic system indexed by the time variable $k$:

$$y(k) = y(k-1) + w(k) \qquad (8)$$

$$\begin{bmatrix} z(k) \\ 0 \\ 0 \end{bmatrix} = \begin{bmatrix} H(k) \\ S_1 \\ S_2 \end{bmatrix} y(k) + \begin{bmatrix} v_0(k) \\ v_1(k) \\ v_2(k) \end{bmatrix} \qquad (9)$$

where $w(k)$ is a zero-mean Gaussian random vector with covariance $\beta_1^{-1} I$. Representing
the space-time interpolation problem as a dynamic system is attractive because the Kalman
filtering algorithm (Gelb, 1974) allows computational efficiency (time-recursive
computation) and flexibility (filtered, predicted, and smoothed estimates). The numerical
solution of the space-time interpolation Eq. (7) is given by the smoothed estimate, which
can be computed as a linear combination of forward and backward filtered estimates based
on the system Eqs. (8,9). Let $(\hat{y}_f(k), P_f(k))$ be the estimate-covariance pair (the forward
estimate) produced by the Kalman filter based on the system equations. Then, Eq. (8) is
replaced by a backward dynamic equation $y(k) = y(k+1) + w(k+1)$ to compute the
backward filtered estimates and covariances $(\hat{y}_b(k), P_b(k))$. The smoothed
estimate-covariance pair $(\hat{y}(k), P(k))$ is given by

$$\hat{y}(k) = P(k) \left\{ P_f^{-1}(k)\, \hat{y}_f(k) + P_b^{-1}(k)\, \hat{y}_b(k) - H^T(k) M z(k) \right\} \qquad (10)$$

$$P(k) = \left\{ P_f^{-1}(k) + P_b^{-1}(k) - H^T(k) M H(k) - \alpha_1 S_1^T S_1 - \alpha_2 S_2^T S_2 \right\}^{-1} \qquad (11)$$

Detailed  derivations  can  be  found  in  textbooks  such  as  Lewis  (1986)  and  Anderson  and 
Moore  (1979). 
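
As an illustration of how Eqs. (8) through (11) fit together, the following sketch runs a forward and a backward Kalman filter in information form and combines them into the smoothed estimate. It is a sketch under stated assumptions rather than the implementation used here: the helper names (`frame_information`, `filter_pass`, `smooth`) are hypothetical, the covariance of $w(k)$ is taken to be $\beta_1^{-1} I$, the filters start from a diffuse prior, every frame is assumed to contain at least one measurement, and dense matrices are used for clarity.

```python
import numpy as np

def frame_information(idx, z, v, n, alpha1, alpha2):
    """Information matrix L(k) and information vector H^T M z(k) of one frame, Eq. (9)."""
    H = np.zeros((len(idx), n))
    H[np.arange(len(idx)), idx] = 1.0
    S1 = np.diff(np.eye(n), n=1, axis=0)
    S2 = np.diff(np.eye(n), n=2, axis=0)
    L = H.T @ (v[:, None] * H) + alpha1 * S1.T @ S1 + alpha2 * S2.T @ S2
    return L, H.T @ (v * z)

def filter_pass(infos, beta1):
    """Kalman filter for the random walk of Eq. (8), run over the frames in the given order."""
    n = infos[0][0].shape[0]
    Q = np.eye(n) / beta1                     # covariance of w(k), assumed beta1^{-1} I
    J_pred, b_pred = np.zeros((n, n)), np.zeros(n)   # diffuse prior (zero information)
    out = []
    for L, b in infos:
        P = np.linalg.inv(J_pred + L)         # measurement update
        y = P @ (b_pred + b)
        out.append((y, P))
        J_pred = np.linalg.inv(P + Q)         # time update: covariance grows by Q
        b_pred = J_pred @ y
    return out

def smooth(infos, beta1):
    """Combine forward and backward filtered estimates with Eqs. (10) and (11)."""
    fwd = filter_pass(infos, beta1)
    bwd = filter_pass(infos[::-1], beta1)[::-1]
    result = []
    for (L, b), (yf, Pf), (yb, Pb) in zip(infos, fwd, bwd):
        # Both filtered estimates already contain frame k, so its information is removed once.
        P = np.linalg.inv(np.linalg.inv(Pf) + np.linalg.inv(Pb) - L)
        y = P @ (np.linalg.solve(Pf, yf) + np.linalg.solve(Pb, yb) - b)
        result.append((y, P))
    return result
```

Under these assumptions the smoothed estimates coincide with the minimizer of the stacked quadratic cost over all frames, which provides a convenient check on small problems.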

Figure 1a illustrates that the formulation Eq. (7) performs adequate interpolation for a
simple ideal case in which $y$ is in fact a function of $x$. Here, for each integer value of
$x \in [1, 100]$ and $t \in [1, 10]$, $y_t$ is computed as

where $u \in [0, 0.2]$ is a uniformly distributed random number. The “measurements” are
made by selecting 25 points along the curve for each $t$ (Fig. 1a). All measurements over
the 10 time-frames are shown in Figure 1b by superposition. The interpolated curve (the
solid line in Fig. 1a) estimated the crests of the waveform reasonably well. The parameters
used were $M = I$, $\alpha_1 = 0.01$, $\alpha_2 = 1$, and $\beta_1 = 0.1$.

Figure 1. (a) An example of space-time interpolation using the formulation Eq. (7) is shown as the solid
curve. The dotted curve is the “truth” while the circles are the “measurements” made in this particular
time-frame. The dashed curve is a result obtained by adding the “temporal linearity” term (cf. Eq. (15))
into the formulation. (b) All the “measurements” superimposed over time.

FORMULATIONS 

1.  Bi-variate  unknown 

Problems with the uni-variate formulation (i.e., assuming that $y$ is a function of $x$) include
the inability to represent certain frequently occurring shapes of meanders (e.g., large “S”
and “O” shapes) and the inability to model uncertainty in the measurements of the longitudes
$x$. The spatial domain of interpolation must be dynamic, rather than fixed, to correctly
assimilate  measurements  in  time  under  temporal  movements  of  the  GSNWP.  A  dynamic 
reference  frame  is  crucial  to  GSNWP  interpolation  as  smoothing  over  a  fixed  spatial  grid 
will  smear  out  meanders  and  other  important  shape  features  along  the  contours,  as 
described  by  Mariano  (1990)  in  a  more  general  context  of  data  melding.  It  is  an  adaptive 
(“object-oriented”)  reference  frame  similar  in  spirit  to  the  Lagrangian  frame.  Unlike  typical 
Lagrangian  formulations,  in  which  physical  motion  models  are  available,  our  problem  must 
deal  with  phenomenologically  characterized  motions  of  the  GSNWP  contours,  making  the 
formulation  challenging  because  of  lack  of  accurate  mathematical  models. 

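We will convert Eq. (7) to a bi-variate formulation. Let $p(s,t) \equiv [x(s,t), y(s,t)]^T$ be the
true contour location, where the spatial domain $s$ is the arclength along the contour at a
given $t$. We denote the points digitized from the SST image as $p_i(k)$, $i \in [1, m(k)]$. The
bi-variate version of Eq. (7) is

$$\min_p \; \sum_{k=1}^{K} \sum_{i=1}^{m(k)} v_i(k) \left\| p(s_i(k), k\Delta t) - p_i(k) \right\|^2 + \int\!\!\int \left[ \alpha_1 \left\| \frac{\partial p}{\partial s} \right\|^2 + \alpha_2 \left\| \frac{\partial^2 p}{\partial s^2} \right\|^2 + \beta_1 \left\| \frac{\partial p}{\partial t} \right\|^2 \right] ds\, dt. \qquad (12)$$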


This minimization is more complex than Eq. (7) because $s_i(k)$, the spatial coordinates (in
terms of arclength) of the digitized points, are unknown. Specifically, the origin of the
spatial index $s$ is difficult to define, since there is no guarantee (even though it is a
reasonable assumption for the Gulf Stream) that all contours pass through a given point
(i.e., the origin) on the x-y plane. Also, $s_i(k)$ must be determined concurrently as the
contours are interpolated. The arclength, in fact, cannot be specified exactly without
knowing the contour $p(k)$ itself. A Kalman filter-based solution for Eq. (12) becomes an
adaptive filtering/smoothing problem:

$$p(k) = p(k-1) + w(k) \qquad (13)$$

$$\begin{bmatrix} q(k) \\ 0 \\ 0 \end{bmatrix} = \begin{bmatrix} H(p(k),k) \\ S_1 \\ S_2 \end{bmatrix} p(k) + \begin{bmatrix} v_0(k) \\ v_1(k) \\ v_2(k) \end{bmatrix} \qquad (14)$$

where the components of the vector $q(k)$ are $p_i(k)$, $i \in [1, m(k)]$. Note that the
data-estimate correspondence matrix $H(p(k),k)$ is now dependent on the state $p(k)$.

Clearly, Eq. (12) must be optimized adaptively: for each $k$, each of $s_i(k)$ and $p(k)$ is
estimated alternately using the best guess for the other, and this process is iterated a
fixed number of times or until the two estimates agree to within
an accuracy parameter. Because of the gaps in the measurements, the estimates at the
previous frame (i.e., $p(k-1)$) are often the best estimates of the general shape of the
contour  at  the  current  time.  Thus,  the  problem  of  establishing  correspondence  can  be 
approached  by  incrementally  matching  the  best  available  estimate  of  the  current  contour 
based  on  the  previous  contour  and  that  based  on  the  spatially  sparse  measurements.  This 
important  feature  matching  problem  will  be  addressed  in  the  next  section. 

2.  Imposing  linearity  over  time 

Once  the  data-estimate  correspondence  is  established,  it  is  straightforward  to  expand  the 
dynamic  system  formulation  Eqs.(13,14)  to  incorporate  various  structural  models  for  the 
GSNWP contours. For example, we can impose a linearity constraint over time by
inserting an additional integrand term

$$\beta_2 \left\| \frac{\partial^2 p}{\partial t^2} \right\|^2 \qquad (15)$$

into Eq. (12). The corresponding change in the dynamic system is augmentation of the state
vector; the dynamic equation is changed to

$$\begin{bmatrix} p(k) \\ p(k+1) \end{bmatrix} = \begin{bmatrix} I & 0 \\ -I & 2I \end{bmatrix} \begin{bmatrix} p(k-1) \\ p(k) \end{bmatrix} + \begin{bmatrix} w_1(k) \\ w_2(k) \end{bmatrix} \qquad (16)$$

where $w_1(k)$ and $w_2(k)$ are zero-mean Gaussian random vectors with covariance
$\beta_1^{-1} I$ and $\beta_2^{-1} I$, respectively. Equation (16) can be written in a more attractive form which
includes the local displacement $d(k) \equiv p(k+1) - p(k)$ as the extra component of the state
vector. The estimates of the local displacement field are of interest in their own right for
statistical characterization of Gulf Stream dynamics. The resulting reformulation consists
of a modified dynamic equation and an additional row in the observation equation:
$$\begin{bmatrix} p(k) \\ d(k) \end{bmatrix} = \begin{bmatrix} I & I \\ 0 & I \end{bmatrix} \begin{bmatrix} p(k-1) \\ d(k-1) \end{bmatrix} + \begin{bmatrix} 0 \\ w_2(k) \end{bmatrix} \qquad (17)$$

$$0 = d(k) + v_3(k) \qquad (18)$$

where $v_3(k) = -w_1(k+1)$. By replacing Eq. (13) with Eq. (17) and adding Eq. (18) to Eq.
(14), we can jointly estimate the GSNWP $p(k)$ and the local displacement $d(k)$. This
formulation is the same in spirit as the approach used by Mariano (1990), except that the
formulation presented here uses two-dimensional (bi-variate) displacement vectors, instead
of the one-dimensional vectors of the previous approach, and that the presented formulation is optimal
in the least-squares sense.
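
The augmented dynamic equation (17) can be checked with a few lines of code. The sketch below is illustrative only; the function names are hypothetical, and the contour is represented as a generic stacked vector. It propagates the state $[p(k); d(k)]$ and relies on the fact that the second difference of $p$ in time equals the driving noise $w_2$, which is the discrete counterpart of the temporal linearity constraint.

```python
import numpy as np

def augmented_transition(n):
    """Transition matrix [[I, I], [0, I]] of Eq. (17) for an n-dimensional contour vector."""
    I, Z = np.eye(n), np.zeros((n, n))
    return np.block([[I, I], [Z, I]])

def simulate(p0, d0, w2):
    """Propagate [p; d] with Eq. (17); row j of w2 is the noise w_2(j + 1)."""
    n = len(p0)
    A = augmented_transition(n)
    x = np.concatenate([p0, d0])
    path = [p0]
    for w in w2:
        x = A @ x + np.concatenate([np.zeros(n), w])   # noise enters the displacement only
        path.append(x[:n])
    return np.array(path)                              # rows are p(0), p(1), ...
```

With `path = simulate(p0, d0, w2)`, the array `np.diff(path, n=2, axis=0)` reproduces all but the last row of `w2`, so a small covariance for $w_2$ keeps the contour motion close to linear in time.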

The formulation based on Eqs. (17,18) is applied to the uni-variate example in the previous
section, i.e., a temporal linearity constraint (i.e., Eq. (15) imposed on $y$ instead of $p$) is
added to Eq. (7). The dashed line in Figure 1a shows one of the resulting interpolated
curves. The figure shows that the curve has gained more “stiffness” and the crests of the
waves are estimated more accurately with this extra constraint (dashed line) than without
it (solid line). The parameter used for the constraint was $\beta_2 = 0.1$.

FEATURE  MATCHING 


This section describes an approach to establish the data-estimate correspondence. For
conciseness, we discuss the filtering problem based on the dynamic system
Eqs. (13,14). As mentioned before, we adopt an adaptive filtering approach where best
predictions of the GSNWP contour at a given time-frame $k$ are used to estimate the
positions, i.e., arc-length indexes $s_i(k)$, of the measurements along the contour.
Specifically, two rudimentary contours, one predicted ahead in time based on the
estimated contour at $k-1$ and the other interpolated only over space based on the
measurements at $k$, are “matched” for correspondence, allowing incorporation of the
measurements to update the predicted GSNWP estimate. In other words, the matrix
$H(p(k),k)$ in Eq. (14) is evaluated as $H(\hat{p}_f(k-1),k)$ in the forward filter and as
$H(\hat{p}_b(k+1),k)$ in the backward filter, where $\hat{p}_f(k)$ and $\hat{p}_b(k)$ represent the forward and
backward filtered estimates, respectively. The two contours are matched hierarchically,
using larger-scale “features” first and then smaller, more local, inflections of the curves.

1.  Feature  detection 


Large bends, especially those at the apexes of the meanders, are the major features along
the GSNWP contours. Although these features are always associated with relatively large
values of curvature (second-order derivative along the arc), such local attributes alone are
not necessarily useful in isolating large meanders among a variety of contour inflections
with much smaller magnitudes. In fact, the magnitudes of the inflections themselves can be
used to identify the meander features more directly. These magnitudes are computed as
the deviations from a progressively fine-scaled, piece-wise linear approximation of the
contour shape. Specifically, consider a segment of the curve between two arbitrary points
$p(s_\ell)$ and $p(s_{\ell+1})$. Let the deviation $d_\ell(s)$ be the (perpendicular) distance from the point
$p(s)$ along the segment ($s \in [s_\ell, s_{\ell+1}]$) to the line connecting points $p(s_\ell)$ and $p(s_{\ell+1})$, as shown
in Figure 2. The points along the curve where large deviations occur are used to segment
the curve into a piece-wise linear “skeleton”, exemplified in Figure 3. Those points
associated with large deviations are the nodes of the skeleton of the curve. The following
is an iterative algorithm to compute the set of nodes, or node set, given the tolerance
parameter $\epsilon$ for the deviations:

1. Initialize the node set with the two end-points of the curve.

2. Let the number of nodes in the set be $L$. Let the indexes of the nodes be $s_\ell$ so that
$s_\ell < s_{\ell+1}$ for $\ell = 1, 2, \ldots, (L-1)$.

3. Find the maximum deviation $d^*$ over the entire curve, i.e., for $\ell = 1, 2, \ldots, (L-1)$,

$$d^* = \max_{\ell} \; \max_{s \in [s_\ell, s_{\ell+1}]} d_\ell(s)$$

Let $s^*$ be the spatial index for the point where the maximum deviation occurs.

4. If $d^* > \epsilon$, include $s^*$ in the node set; then, go back to step 2 and repeat. Otherwise
($d^* \le \epsilon$), stop.

The internal node points, $p(s_2), p(s_3), \ldots, p(s_{L-1})$, after the final iteration are referred to as
the feature points.
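
The four steps above can be sketched for a contour sampled at discrete points. This is a minimal illustration under the assumption that the curve is given as a $k \times 2$ array of ordered points, so that node "indexes" are array indices rather than arclengths; the function names are hypothetical.

```python
import numpy as np

def point_line_distance(pts, a, b):
    """Perpendicular distance from each row of pts to the line through a and b."""
    ab = b - a
    length = np.hypot(ab[0], ab[1])
    if length == 0.0:                       # degenerate segment: fall back to point distance
        return np.hypot(pts[:, 0] - a[0], pts[:, 1] - a[1])
    cross = ab[0] * (pts[:, 1] - a[1]) - ab[1] * (pts[:, 0] - a[0])
    return np.abs(cross) / length

def skeleton_nodes(curve, eps):
    """Return the sorted indices of the skeleton nodes of a sampled curve."""
    nodes = [0, len(curve) - 1]                          # step 1: the two end-points
    while True:
        best_dev, best_idx = 0.0, None
        for lo, hi in zip(nodes[:-1], nodes[1:]):        # steps 2-3: scan every segment
            if hi - lo < 2:
                continue                                 # no interior points to test
            dev = point_line_distance(curve[lo + 1:hi], curve[lo], curve[hi])
            j = int(np.argmax(dev))
            if dev[j] > best_dev:
                best_dev, best_idx = dev[j], lo + 1 + j
        if best_idx is None or best_dev <= eps:          # step 4: stop when d* <= eps
            return nodes
        nodes.append(best_idx)
        nodes.sort()

# Illustrative data only: a single triangular "meander".
curve = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [3.0, 1.0], [4.0, 0.0]])
nodes = skeleton_nodes(curve, eps=0.5)
```

The interior entries of `nodes` index the feature points; increasing `eps` retains only the larger meanders.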

2.  Feature  matching 

Let us consider matching feature points from two curves. Each feature point is at the apex
of a corner on the skeleton of a curve. A cost is assigned to each possible matching
pair of feature points as a sum of the costs associated with the distance, angle, and
direction of the corner. Let $p(a)$ and $p(b)$ be feature points from each of the two curves.
Each feature point, say $p(a)$, is a junction of two line segments of the skeleton; let the two
unit vectors pointing along these line segments and originating in the feature point $p(a)$ be
$u_{a1}$ and $u_{a2}$. Let $u_{b1}$ and $u_{b2}$ be similarly defined unit vectors around the feature point $p(b)$.
Also, we measure the direction of a vector $v$ as the angle $\angle(v)$ in radians (in the
longitude-latitude coordinate system). The costs are defined as

1. Distance. $C_1 = \| p(a) - p(b) \|^2$

The distance between the pair of points.

2. Angle. $C_2 = \left( |\angle(u_{a1}) - \angle(u_{a2})| - |\angle(u_{b1}) - \angle(u_{b2})| \right)^2$

The absolute value of the difference between the angles of the corners associated with
each of the two feature points.

3. Direction. $C_3 = |\angle(u_{a1} + u_{a2}) - \angle(u_{b1} + u_{b2})|^2$

The difference between the directions of the openings of the two corners, i.e., the
directions of the vectors bisecting the angles.

We penalize large values of these cost functions more heavily (i.e., more than by a linear
proportion) than relatively small values. This is achieved by post-distorting the cost by a
piece-wise linear mapping function, such as that shown in Figure 4, which discounts
smaller cost values and inflates larger values by multipliers (slopes in the figure) smaller
and larger, respectively, than 1.

The pairs of feature points with smaller total costs (the sum of three post-processed cost
functions) are considered to be matching pairs, with the following constraints:

• The total cost for any matching pair must be smaller than a specified value, which we
will refer to as $C_{\max}$.

• A feature point cannot be matched to more than one other feature point.

• The line segments connecting matched feature points can never cross each other.

The last constraint reflects the structural integrity of the meanders (features): the
GSNWP meanders can only appear and disappear; they cannot change their sequencing
order along the contour.

To summarize, the number of parameters to be specified for feature point matching is
10: $C_{\max}$, and the two multipliers and a threshold value (the slopes and “th” in Fig. 4) for
each of the three cost functions $C_1$, $C_2$, and $C_3$.
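
The three costs and the post-distortion can be sketched as follows. The code is an illustration under stated assumptions, not the implementation used here: angles are taken with `arctan2` and differenced literally as in the definitions above (no wrap-around handling at $\pm\pi$), and the mapping of Figure 4 is assumed to be continuous, with slope `slope_lo` below the threshold and `slope_hi` above it. All function and parameter names are hypothetical.

```python
import numpy as np

def angle(v):
    """Direction of a 2-D vector in radians."""
    return np.arctan2(v[1], v[0])

def corner_costs(pa, ua1, ua2, pb, ub1, ub2):
    """Raw costs C1 (distance), C2 (corner angle), C3 (opening direction)."""
    c1 = float(np.sum((pa - pb) ** 2))
    c2 = (abs(angle(ua1) - angle(ua2)) - abs(angle(ub1) - angle(ub2))) ** 2
    c3 = abs(angle(ua1 + ua2) - angle(ub1 + ub2)) ** 2
    return c1, c2, c3

def distort(c, th, slope_lo, slope_hi):
    """Piece-wise linear mapping: discount costs below th, inflate the excess above th."""
    if c <= th:
        return slope_lo * c
    return slope_lo * th + slope_hi * (c - th)

def total_cost(costs, params):
    """Sum of the three post-processed costs; params holds (th, slope_lo, slope_hi) per cost."""
    return sum(distort(c, *p) for c, p in zip(costs, params))
```

A candidate pair would then be accepted only if `total_cost` is below $C_{\max}$ and the one-to-one and non-crossing constraints listed above hold; that selection step is not shown.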

3.  Local  matching 

Figure 4. A typical mapping function for postprocessing of the cost “C” (representing $C_1$, $C_2$, or $C_3$).
The values smaller than the threshold “th” are discounted while values larger are inflated.

Figure 5. Matching contour segment AB to segment A'B'.

Once correspondence of major features is established, non-feature points can be matched
by a simple proportional mapping, leading to a correspondence match of the two contours
in their entireties. In Figure 5, for example, the pairs of points (A, A') and (B, B') represent
matched feature points, and arc-length indexes $s$ and $s'$ along the two contour segments
between the respective feature points are considered to be a matching pair if

$$\frac{s - s_A}{s_B - s_A} = \frac{s' - s_{A'}}{s_{B'} - s_{A'}} \qquad (19)$$

where $s_A$, $s_{A'}$, $s_B$, and $s_{B'}$ are indexes of the feature points.
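
The proportional mapping of Eq. (19) amounts to piece-wise linear interpolation between the arc-length indexes of matched feature points. A minimal sketch, assuming the matched indexes are supplied as two increasing sequences (the function name is hypothetical):

```python
import numpy as np

def map_arclength(s, feat_s, feat_s_prime):
    """Map arc-length index s on one contour to s' on the other using Eq. (19).

    feat_s, feat_s_prime : arc-length indexes of the matched feature points on the
    two contours, in increasing order (the non-crossing constraint keeps both ordered).
    """
    return np.interp(s, feat_s, feat_s_prime)
```

For example, with feature points at `feat_s = [2.0, 6.0]` and `feat_s_prime = [1.0, 9.0]`, the index `s = 3.0` lies one quarter of the way along the first segment and is mapped to `s' = 3.0`, one quarter of the way along the second.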

Unfortunately,  matched  pairs  of  feature  points  are  sometimes  too  sparse  to  be  able  to 
guide  correspondence  of  the  two  contours  reliably:  Distance  between  adjacent  feature 
points  on  a  contour  can  be  larger  than  the  phenomenological  length  scale,  a  gap  in 
measured  points  can  occur  between  feature  points,  and  some  measurements  do  not  contain 
any  significant  meander  features. 


To remedy this, we need a secondary method to register the indexes for two given
contours without relying on feature identification and matching. One way of performing
such a task is to deform one of the contours toward the other using a variational
formulation involving cost terms for the structure of the deformed contour and for distances
between points on the two contours. Let $p_1(s)$ and $p_2(s)$ be the two contours to be matched
and $p(s)$ be a deformation of $p_1(s)$. The deformed contour $p(s)$ inherits the indexes of
$p_1(s)$; thus, by physically registering $p(s)$ onto $p_2(s)$, correspondence between the two
index sets can be found. [Such a technique for contour registration is generically known as a
“snake” in computational vision (Kass et al., 1988).] Specifically, we consider the
optimization problem


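$$\min_p \; \int_{\Omega_1} \left[ \alpha_1 \left\| \frac{\partial p}{\partial s} \right\|^2 + \alpha_2 \left\| \frac{\partial^2 p}{\partial s^2} \right\|^2 + \gamma_0 \| p - p_1 \|^2 + \gamma_1 \left\| \frac{\partial}{\partial s} (p - p_1) \right\|^2 + \gamma_2 \left\| \frac{\partial^2}{\partial s^2} (p - p_1) \right\|^2 \right] ds + F(p_2, p) \qquad (20)$$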


where the “gravity” term $F(p_2, p)$ works to minimize the distances between points along
$p(s)$ and $p_2(s)$ and is given by


FiPt^p)  =  “  P2  (^')ir 


(21) 


The domains $\Omega_1$ and $\Omega_2$ of the integrations are given by the contours $p_1$ and $p_2$,
respectively. The three cost terms with coefficients $\gamma_0$, $\gamma_1$, and $\gamma_2$ keep the shape of $p(s)$
from becoming radically different from that of $p_1(s)$. The minimizing $p(s)$ is given by the
non-linear Euler-Lagrange equation
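$$2 \left[ (\alpha_2 + \gamma_2) \frac{\partial^4}{\partial s^4} - (\alpha_1 + \gamma_1) \frac{\partial^2}{\partial s^2} + \gamma_0 \right] p - 2 \left[ \gamma_2 \frac{\partial^4}{\partial s^4} - \gamma_1 \frac{\partial^2}{\partial s^2} + \gamma_0 \right] p_1 + \frac{\partial}{\partial p} F(p_2, p) = 0 \qquad (22)$$

which, since $p$ is the only variable, can be written concisely as

$$2 \left[ (\alpha_2 + \gamma_2) \frac{\partial^4}{\partial s^4} - (\alpha_1 + \gamma_1) \frac{\partial^2}{\partial s^2} + \gamma_0 \right] p - P + \frac{\partial}{\partial p} F(p) = 0 \qquad (23)$$

where

$$P = 2 \left[ \gamma_2 \frac{\partial^4}{\partial s^4} - \gamma_1 \frac{\partial^2}{\partial s^2} + \gamma_0 \right] p_1.$$

Given a parameter $\kappa$, Eq. (23) can be solved iteratively (Kass et al., 1988) as

$$2 \left[ (\alpha_2 + \gamma_2) \frac{\partial^4}{\partial s^4} - (\alpha_1 + \gamma_1) \frac{\partial^2}{\partial s^2} + \gamma_0 \right] p_{(i)} = P - \kappa \left( p_{(i)} - p_{(i-1)} \right) - \frac{\partial}{\partial p} F(p_{(i-1)}) \qquad (24)$$

which is equivalent to Eq. (23) if $p_{(i)} \to p$ as $i \to \infty$. Given $p_{(i-1)}$, Eq. (24) can be solved
simply by inversion of a linear differential operator as

$$p_{(i)} = \left\{ 2 \left[ (\alpha_2 + \gamma_2) \frac{\partial^4}{\partial s^4} - (\alpha_1 + \gamma_1) \frac{\partial^2}{\partial s^2} + \gamma_0 \right] + \kappa \right\}^{-1} \left\{ P + \kappa\, p_{(i-1)} - \frac{\partial}{\partial p} F(p_{(i-1)}) \right\} \qquad (25)$$

which we have implemented numerically. The iterations are initialized with $p_{(0)} = p_1$.
Graphically, as the iterations progress, the contour $p_{(i)}(s)$ approaches $p_2(s)$ in a structurally
constrained manner (from which the name “snake” is derived). When $\kappa$ is large
convergence is slow; when it is small the solution becomes unstable. We have chosen a
relatively small value of $\kappa$ for the first few iterations and then a larger value of $\kappa$ for the
rest of the iterations to ensure convergence. We used a total of about 20 such iterations
per solution of Eq. (22).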
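
The semi-implicit iteration of Eq. (25) can be illustrated on sampled contours. The sketch below rests on several assumptions that are not taken from the text: unit spacing in $s$, difference matrices as discrete derivative operators, a "gravity" gradient equal to twice the offset from each point of $p$ to its nearest sample of $p_2$ (a stand-in for the gradient of Eq. (21)), and illustrative parameter values. The function name and argument layout are hypothetical.

```python
import numpy as np

def snake_register(p1, p2, alpha=(0.01, 1.0), gamma=(0.1, 0.1, 0.1),
                   kappas=(1.0,) * 5 + (10.0,) * 15):
    """Deform contour p1 (n x 2) toward contour p2 (n2 x 2) by iterating Eq. (25)."""
    n = len(p1)
    D1 = np.diff(np.eye(n), n=1, axis=0)
    D2 = np.diff(np.eye(n), n=2, axis=0)
    K1, K2 = D1.T @ D1, D2.T @ D2            # discrete -d2/ds2 and d4/ds4
    a1, a2 = alpha
    g0, g1, g2 = gamma
    I = np.eye(n)
    A = 2.0 * ((a2 + g2) * K2 + (a1 + g1) * K1 + g0 * I)   # linear operator acting on p
    P = 2.0 * (g2 * K2 + g1 * K1 + g0 * I) @ p1            # fixed term built from p1
    p = p1.copy()                                          # p_(0) = p_1
    for kappa in kappas:                     # smaller kappa first, then a larger one
        diff = p[:, None, :] - p2[None, :, :]
        nearest = p2[np.argmin((diff ** 2).sum(axis=-1), axis=1)]
        grad_F = 2.0 * (p - nearest)         # assumed gravity gradient, evaluated at p_(i-1)
        p = np.linalg.solve(A + kappa * I, P + kappa * p - grad_F)
    return p
```

Because the deformed contour keeps the sample ordering of `p1`, the nearest sample of `p2` to each row of the result gives a candidate index correspondence between the two contours.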




Figure  6.  Examples  of  interpolated  GSNWP  contours  (solid  lines).  The  small  circles  are  the  digitized  data 
points.  The  cross  hairs  along  the  32°N  lines  are  the  standard  deviations  associated  with  the  estimated 
contour  points  directly  above  them.  The  lengths  of  the  two  arms  of  each  cross  represent  standard 
deviations  in  the  estimates  of  the  longitude  and  latitude  associated  with  the  estimated  point. 
RESULTS

Equation (12), along with the feature detection and matching scheme discussed in the
previous section, has been used to interpolate 150 frames of data from the period April
1982 to February 1983. Figure 6 shows two of the interpolated GSNWP contours along
with  the  digitized  data  points  (small  circles),  indicating  that  the  bi-variate  formulation  is 
able  to  reconstruct  macroscopic  features  like  the  “S”  and  “O”  shapes  by  interpolating  data 
from  nearby  time-frames.  The  data  points  from  nearby  frames  are  shown  in  Figure  7.  Also, 
the  standard  deviations  (produced  by  the  Kalman  filter-based  algorithm)  in  the 
longitude/latitude  estimates  of  selected  points  are  depicted  in  Figure  6  by  the  crosshairs 
(see  the  figure  caption).  As  expected,  the  standard  deviations  are  larger  away  from  the 
data  points  and  smaller  near  the  data  points. 

The  algorithm  has  been  tested  further  by  “hind-forecasting”:  a  particular  frame  of  data 
points  is  removed,  and  the  contour  in  that  frame  is  then  predicted  by  interpolation  based 
only  on  data  in  other  frames.  Ideally,  the  predicted  contour  matches  well  with  the  actual 
data  points  which  did  not  participate  in  the  interpolation.  (Note,  however,  that  the 
digitized  points  in  a  given  frame  can  sometimes  misrepresent  the  true  frontal  location 
because  of  imaging  noise,  inconsistency  among  the  personnel  who  perform  the  digitization 
task,  etc.)  Figure  8  shows  the  hind-forecasted  contours  of  the  same  two  frames  as  those  in 
Figure 6, while Figure 9 (cf. Fig. 10) shows the hind-forecasts for another pair of frames.
In  these  figures,  the  data  points  match  fairly  well  with  the  hind-forecasts,  and,  in  fact,  the 
agreement  between  the  data  and  hind-forecasts  is  observed  generally  throughout  our  test. 
There are, however, several inconsistent hind-forecasts, two of which are shown in Figure
11 (cf. Fig. 12). As indicated in the figure, a major flaw in these hind-forecasts is an inability
to  resolve  some  fast  movements  of  the  meanders  and  to  detect  transformations  of  the 
meanders  into  rings.  Obviously,  simple  smoothness  constraints  like  those  in  Eq.  (12)  by 
themselves  are  not  able  to  handle  events  such  as  formation  of  rings  and  are  heavily 
dependent  on  the  data  to  resolve  such  events. 

DISCUSSION 

Although present-day pattern recognition and matching algorithms have yet to match the
flexibility and sensitivity of trained personnel, the major advantages of a mechanized system in
GSNWP estimation are speed, objectivity, and consistency, which are important in high
volume  production  of  the  estimates.  Also,  a  probabilistic  formulation,  such  as  that 
presented  in  this  report,  yields  a  measure  of  confidence  in  the  estimates  in  the  form  of  the 
second  order  statistics  to  facilitate  interpretation  of  the  results.  We  feel  that  such  a 
statistical  interpretation  will  be  enhanced  if  the  uncertainty  (noise  variance)  in  each 
digitized  data  point  is  quantified  by  using  a  probabilistic  edge-detection  algorithm  (e.g., 
Canny,  1986)  on  the  SST  images.  A  new  edge  detection  algorithm  using  both  spatial  and 
temporal constraints is being tested by Cayula and Cornillon (cf. 1990) at URI. A
symbiotic  merging  of  such  an  edge  detection  with  our  interpolation  algorithm  should 
Figure 7. The digitized data points from five frames centered around the two frames depicted in Figure 6.
Each of two columns of five frames shows a time-sequence of the digitized data points, with the third
frame being the frame from Figure 6.


INTERPOLATION  OF  GAPPY  FRONTAL  DATA 




Figure 10. The digitized data points from five frames centered around the two frames depicted in Figure
9. Each of two columns of five frames shows a time-sequence of the digitized data points, with the third
frame being the frame from Figure 9.


Figure  11.  Two  cases  where  hind-forecasts  have  failed,  due  to  temporal  Gulf  Stream  dynamics 
unresolvable  from  this  particular  data  sequence. 


Figure 12. The digitized data points from five frames centered around the two frames depicted in Figure
11. Each of two columns of five frames shows a time-sequence of the digitized data points, with the third
frame being the frame from Figure 11.
reduce the effect of inconsistencies in the initial frontal locations. We are also considering
a higher order model for contour dynamics (Pratt and Stern, 1986) as an extension of the
work  presented  in  this  report. 

In the near future, all available digitized frontal locations in the Gulf Stream, Brazil-Malvinas
confluence, and Kuroshio current systems will be interpolated. The
spatial/temporal  variability  and  phase  speed  distribution  of  the  resulting  complete  frontal 
locations  will  be  documented. 
SOME NOTES ON DATA ASSIMILATION

James J. O'Brien

Mesoscale  Air-Sea  Interaction  Group,  The  Florida  State  University,  Tallahassee,  FL 
32306-3041 

INTRODUCTION 

This  paper  is  a  discussion  of  the  author's  emphasis  on  data  assimilation  in  physical 
oceanography.  The  work  draws  on  recent  work  by  members  of  the  MASIG  Team.  Our 
approach  has  focused  on  time-dependent  models  in  which  parameters  are  estimated 
through  data  assimilation  using  the  variational  adjoint  method. 

It is useful to adopt a paradigm for classifying all data assimilation methods. I choose to
define three groups of assimilation schemes: (A) local polynomial interpolation methods,
(B)  statistical  (including  optimal)  interpolation  methods,  and  (C)  variational  numerical 
analysis  methods. 

In  (A),  the  idea  is  to  expand  the  data  misfit  in  terms  of  some  interpolating  polynomial  in 
the  spatial  vicinity  of  the  data  location;  direct  insertion  or  substitution  or  “bogusing”  are 
some  simple  examples;  Cressman  filters  are  a  commonly  used  meteorological  assimilation 
technique.  No  knowledge  of  the  statistical  property  of  the  data  or  the  model  is  used. 

In  (B),  we  use  statistical  information  of  the  data  error  field  or  the  model  variability  to 
determine the adjustment in space and time. In principle, one could estimate the cross-correlation
function of the data misfit and adopt some rules to adjust the model solution.
The  simplest  idea  is  the  so-called  nudging  method  where  an  inverse-time  parameter  is  used 
to  estimate  the  variability  of  the  data  misfit.  The  most  sophisticated  example  is  the 
Kalman-Bucy  filtering  method.  In  all  the  implementation  schemes  one  should  imagine  that 
the  physical  model  is  evolving  in  time,  and  a  moment  arrives  when  a  data  value  is 
encountered.  The  data  misfit  is  then  added  to  the  model  field  in  time  and  space.  If  the 
covariance  matrix  structure  is  primarily  spatial,  then  the  simplest  time  structure  for  the 
variability is nudging where a linear time decay process is added to the prognostic
model. 
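
The nudging idea can be illustrated with a minimal sketch. It assumes a hypothetical scalar toy model, dx/dt = -sin(t), and a relaxation time scale `tau`; neither is taken from the work described here. The nudged run adds the term (obs - x)/tau to the model tendency, so a biased initial state relaxes toward the observations while the free run keeps its bias.

```python
import numpy as np

def integrate(x0, tendency, dt, n_steps, obs=None, tau=None):
    """Forward Euler integration; if obs is given, nudge toward it."""
    x = np.empty(n_steps + 1)
    x[0] = x0
    for n in range(n_steps):
        dxdt = tendency(n * dt, x[n])
        if obs is not None:
            # Nudging: linear relaxation toward the observed value
            dxdt += (obs[n] - x[n]) / tau
        x[n + 1] = x[n] + dt * dxdt
    return x

# Hypothetical toy model (an assumption for illustration only)
tendency = lambda t, x: -np.sin(t)
dt, n_steps = 0.01, 500

truth = integrate(1.0, tendency, dt, n_steps)    # plays the role of the data
free = integrate(3.0, tendency, dt, n_steps)     # biased run, no assimilation
nudged = integrate(3.0, tendency, dt, n_steps, obs=truth, tau=0.5)
```

In this sketch the initial error of the nudged run decays with e-folding time `tau`; a small `tau` forces the model hard toward the data, a large one trusts the model more.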

In  (C),  the  assimilation  scheme  defines  a  statistically  weighted  data  misfit  field,  which  is 
minimized  in  a  construct  such  that  the  complete  physics  of  the  prognostic  model  is 
included  as  dynamical  constraints.  I  will  concentrate  on  examples  of  this  latter  method. 
THE VARIATIONAL ADJOINT METHOD

The essential ingredients in this data assimilation are a “nice” model, availability of some
“useful” data, and a willingness to adjust the model in some manner. Each of these
elements must be appreciated. The model should produce validated solutions that are
reasonable  and  “liked.”  The  data  may  be  estimates  of  model-dependent  state  variables  or 
the  data  may  be  any  function  of  a  dependent  state  variable  as  long  as  an  estimate  of  the 
function  can  be  calculated  from  the  model  output.  A  simple  example  would  be  ocean 
altimeter  cross-over  data.  The  difference  in  time  between  two  altimeter  readings  at  a  point 
can  be  estimated  from  the  solution  to  any  ocean  model  that  simulates  sea  level,  and 
therefore  altimeter  cross-overs  can  be  assimilated. 

For a contrived example, let us consider the following model. Suppose a scalar field,
c(x,t), is advected by an unknown advection field, u(x), and other processes are
represented by g(c,β), where β is a poorly defined parameter. We “like” our model after we
guess u, β, and the initial conditions, c'(0,x). We acquire some data, F'(c), where F(c) is any
function of c which we can estimate from the output of c(x,t). The model is

$$ c_t + u c_x = g(c,\beta). \tag{1} $$


There  are  many  avenues  to  arrive  at  the  variational  problem.  I  choose  simply  to  write 
down  the  functional 

$$
\begin{aligned}
H(c,\lambda,u,\beta) ={}& \int_x\!\int_T \lambda\,(c_t + u c_x - g)\,dx\,dt \\
&+ \frac{K_1}{2}\int_x\!\int_T \bigl(F(c) - F'(c)\bigr)^2\,dx\,dt \\
&+ \frac{K_2}{2}\int_x\!\int_T (u - u')^2\,dx\,dt \\
&+ \frac{K_3}{2}\int_x\!\int_T (\beta - \beta')^2\,dx\,dt,
\end{aligned} \tag{2}
$$

where λ(x,t) is a Lagrange multiplier and the K_i are called Gauss precision moduli. The
range of space is over all x, say, x ∈ [0,L], and periodic, e.g., c(t,x) = c(t,x+L). The last
three terms are called the cost function, which is to be minimized subject to the constraint
that the data, F', the advection function, u, and the parameter, β, must satisfy the
model. The range of time is [0,T]; T is a time later than the last observed datum.
The minimum is determined by the usual approach.

$$ \frac{\partial H}{\partial \lambda} = 0 \quad\text{yields}\quad c_t + u c_x = g(c,\beta), \tag{3} $$

where u and β are now not known.

$$ \frac{\partial H}{\partial u} = 0 \quad\text{yields}\quad u(x) = u'(x) - \frac{1}{T K_2}\int_T \lambda c_x\,dt, \tag{4} $$

where we observe that the correction of u from its guess field depends on the average of
the product of the Lagrange multiplier and the spatial gradient of the dependent state
variable.

$$ \frac{\partial H}{\partial \beta} = 0 \quad\text{yields}\quad \beta = \beta' + \frac{1}{K_3 L T}\int_x\!\int_T \lambda \frac{\partial g}{\partial \beta}\,dx\,dt, \tag{5} $$

and

$$ \frac{\partial H}{\partial c} = 0 \quad\text{yields}\quad \lambda_t + (u\lambda)_x = -\lambda g_c + K_1\bigl[F(c) - F'(c)\bigr]F_c \tag{6} $$

$$ {}+ \int_x \lambda(0,x)\,c(0,x)\,dx + \int_x c(0,x)\,dx. $$

The next to last integral vanishes using the lemma that a product of periodic functions is
periodic. We have used the natural spatial boundary conditions for λ and chosen
λ(T,x) = 0. Note that the last integral is zero except at t = 0.


The solution procedure is as follows:

1. Guess u', β', c'(0,x) and calculate the solution forward over the time [0,T] from Eq. (1).

2. Calculate the data misfit, F − F', and the data misfit transfer function, F_c, and integrate
Eq. (6) backwards in time from T to zero.

3. Next adjust the initial conditions, u(x), and β using Eqs. (4), (5), and (6) (the latter for c(0,x)).

4. Repeat steps 1, 2, and 3 as often as desired in order to assimilate the data, F'(c).
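
The forward/backward cycle above can be sketched on a deliberately simpler problem than Eq. (1). The sketch assumes a scalar decay model, dc/dt = -βc, discretized with forward Euler, and a cost function that contains only the data misfit (no penalty on the guess fields). The function names, the step size, and the fixed-step descent are illustrative assumptions, not the algorithm used by the author. One forward run gives the model state, one backward sweep of the discrete adjoint gives the gradient of the cost with respect to β, and β is then adjusted.

```python
import numpy as np

def forward(beta, c0, dt, n_steps):
    """Step 1: integrate the model c_{n+1} = c_n - dt*beta*c_n forward."""
    c = np.empty(n_steps + 1)
    c[0] = c0
    for n in range(n_steps):
        c[n + 1] = c[n] - dt * beta * c[n]
    return c

def cost(c, data):
    """Data misfit only: 0.5 * sum of squared differences."""
    return 0.5 * np.sum((c[1:] - data[1:]) ** 2)

def adjoint_gradient(beta, c, data, dt):
    """Step 2: integrate the adjoint backward, accumulating dJ/dbeta."""
    lam = 0.0      # adjoint variable, zero beyond the final time
    grad = 0.0
    for n in range(len(c) - 1, 0, -1):
        # Adjoint recursion, forced by the data misfit at step n
        lam = (1.0 - beta * dt) * lam + (c[n] - data[n])
        # Explicit dependence of c_n on beta is -dt * c_{n-1}
        grad += lam * (-dt * c[n - 1])
    return grad

def assimilate(data, c0, dt, beta_guess, step=0.005, n_iter=300):
    """Steps 3 and 4: adjust beta and repeat (fixed-step descent)."""
    beta = beta_guess
    for _ in range(n_iter):
        c = forward(beta, c0, dt, len(data) - 1)
        beta -= step * adjoint_gradient(beta, c, data, dt)
    return beta

# Synthetic "observations" generated with beta = 0.5 (an assumed value)
data = forward(0.5, 1.0, 0.1, 50)
beta_est = assimilate(data, 1.0, 0.1, beta_guess=0.2)
```

A fixed descent step is used only to keep the sketch short; a conjugate gradient method would normally need far fewer forward and backward integrations.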

There are many advantages to this algorithm. It will almost always converge; all the
data are used, and it is elegant. I am told that one can contrive a case where it will not
converge. There are some disadvantages. It is very expensive because we have to integrate
two models and save the solution from both models, particularly when the physical model
is nonlinear. It may take many forward and backward integrations to find the minimum.

An  emphasis  of  current  research  is  to  identify  algorithms  which  find  the  minimum  in  as  few 
integrations as possible. The present view is to implement an efficient conjugate gradient
algorithm. A further disadvantage is that this method is difficult to teach to scientists. We
are beginning to have several simple examples that will demonstrate the method to
scientists. 
A SIMPLE REAL OCEAN EXAMPLE

There have been only a few superb modern upper-ocean data expeditions that have
measured both meteorological variables and upper ocean currents. One such experiment is LOTUS, from
which Briscoe, Price and Weller have provided us data to develop data assimilation
methods. Suppose we wish to assimilate wind and ocean current data and determine the
momentum drag coefficient and the mixing function for momentum, A(z).

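The model equation is

$$ \frac{\partial w}{\partial t} + i f w = \frac{\partial}{\partial z}\left(A\frac{\partial w}{\partial z}\right), \tag{7} $$

where w = u + iv. The boundary conditions are, at z = 0,

$$ \rho_w A \frac{\partial w}{\partial z} = \tau, \tag{8} $$

where the wind stress is calculated from

$$ \tau = \rho_a C_D |w_a| w_a. \tag{9} $$

At the bottom, z = −H,

$$ A \frac{\partial w}{\partial z} = 0. $$

The initial condition for this dynamic system is w = w_0 at t = 0. We chose to
nondimensionalize the system as follows.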


where 


t  = — ,  w  =— ,  z  =— ,  A  =— ,  Cd  =-^,  w'  =  — 2- 

Ts  U  D  s,  ^ 


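which yields the model

$$ \frac{\partial w}{\partial t} + i w = \frac{\partial}{\partial z}\left(A\frac{\partial w}{\partial z}\right) \tag{10} $$

with

$$ A\frac{\partial w}{\partial z} = \begin{cases} C_D |w_a| w_a & \text{for } z = 0 \\ 0 & \text{for } z = -H/D \end{cases} \tag{11} $$

and w = w_0 for t = 0.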
Following the formalism developed in the previous section, we define the cost function, J:


y  (w,  A,Co)  =  ]rK^\\(yv~  w)^  dl^dt 

t  Z 

+ ]-  Kj\  ( A  -  A)^  1 KTH  (Co  -c^f. 


(12) 


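The functional, L, is the sum of the cost function and the constraint:

$$ L(w,A,C_D,\lambda) = J + \int_t\!\int_z \lambda\left(\frac{\partial w}{\partial t} + i w - \frac{\partial}{\partial z}\left(A\frac{\partial w}{\partial z}\right)\right) dz\,dt. \tag{13} $$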


The  solution  is  found  as  usual  by  solving 


dLiw,A,Cii,X)  _ 


=  0 
=  0 


This yields the model plus


dX 

dL{w,A,CQ,X)  _ 
dw 

^L(w,A.Cp,A) 

dA 

^L(vt',A,Co,A) 

dc^ 


dX  d  dX  ~ 

-^+iX+—{A—)  =  K„{w-w) 
at  ai  dz 


Co  =  Co  J(ka|«aA„^0  +\^a\VaKz=o) 


.  =  A^^f( 


du  dXu  ^  dv  dXv 
dz  dz  dz  dz 


(14) 

(15) 

(16) 


Note that we have assumed that C_D is a constant and A is only a function of depth. We can
rescale the parameters, K, by using

—  =  A',  -^  =  K'  and  -^  =  K' 


'‘M 


K. 


K. 


2% 
This yields


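$$ \frac{\partial\lambda}{\partial t} + i\lambda + \frac{\partial}{\partial z}\left(A\frac{\partial\lambda}{\partial z}\right) = (w - \hat{w}) \tag{17} $$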

= ^  ^  +K|v.A^^)dT 

(18) 

.  ;  \  Bu  BXu  Bv  dAvA  , 

(19) 

Bz  Bz  Bz) 

In  order  to  solve  these  equations  we  need  to  define  a  solution  space.  This  is  shown  in 
Figure  1. 


Figure 1. Diagram of the vertical structure of the numerical model.
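
As an illustration of what a discrete solution space for Eq. (7) can look like, the following is a minimal sketch, assuming a uniform vertical grid with w = u + iv stored at cell centers and the flux A ∂w/∂z stored at cell interfaces. The grid layout, the explicit time stepping, and all numerical values are assumptions made for this sketch; they are not the grid of Figure 1. The surface stress enters as the flux through the top interface and the bottom interface carries no flux.

```python
import numpy as np

def step_ekman(w, A_half, surface_flux, f, dz, dt):
    """Advance w = u + i v by one time step.

    w[0] is the top cell; A_half holds A at the N-1 interior interfaces;
    surface_flux is the kinematic stress (tau / rho_w) applied at z = 0.
    """
    flux = np.zeros(w.size + 1, dtype=complex)    # A dw/dz at interfaces
    flux[0] = surface_flux                        # stress condition at z = 0
    flux[1:-1] = A_half * (w[:-1] - w[1:]) / dz   # interior interfaces
    # flux[-1] stays zero: no stress at the bottom
    w_mixed = w + dt * (flux[:-1] - flux[1:]) / dz
    # Coriolis term i f w handled by an exact rotation over one step
    return w_mixed * np.exp(-1j * f * dt)

def run_ekman(n_levels, n_steps, A, surface_flux, f, dz, dt):
    """Integrate from rest with a depth-independent A (illustrative)."""
    w = np.zeros(n_levels, dtype=complex)
    A_half = np.full(n_levels - 1, A)
    for _ in range(n_steps):
        w = step_ekman(w, A_half, surface_flux, f, dz, dt)
    return w

# Illustrative values only: 50 one-metre cells, 10 s steps, about 2.8 hours
w_profile = run_ekman(50, 1000, A=1e-2, surface_flux=1e-4 + 0j,
                      f=1e-4, dz=1.0, dt=10.0)
```

The explicit diffusion step is stable only while A·dt/dz² stays below one half, which is one reason an implicit scheme may be preferable when A is itself being adjusted by the assimilation.
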
Our procedure for using the variational method to solve this system can be fully described:

(1) Begin with a best initial estimate for the control parameters A and C_D.

(2) Integrate the model equation (7) forward in time and calculate the value of the cost
function.

(3) Compute the data misfits (w − ŵ).

(4) Integrate the adjoint equation (17) backward in time.

(5) Use equations (18) and (19) to calculate the gradients of the cost function, ∇J,
corresponding to A and C_D, with solutions for λ and w from steps (2) and (4).

(6) With the gradient information, apply the descent algorithm to obtain the new values
of A and C_D simultaneously.

(7) Check if the minimization process is done. The convergence criterion is satisfied if
the ratio |∇J|/|∇J_0| falls below a prescribed small tolerance, where ∇J_0 is the value at the initial iteration.

(8) Return to step (2) if the optimal solution is not found.

We  will  demonstrate  a  solution  using  currents  over  10  days  in  the  summer  in  the  North 
Atlantic during the LOTUS experiment. Figure 2 shows the observed currents at 5 and 15
meters. Only a low frequency trend has been omitted from the original data. The cost
function and the gradient are shown in Figure 3 as a function of iteration. Note
that  the  cost  function  reaches  a  “practical”  minimum  in  four  iterations.  The  profile  of  the 
eddy  viscosity  coefficient  and  the  drag  coefficient  are  shown  in  Figure  4.  The  surface  value 
of 0.003  implies  an  “Ekman  Layer  Depth”  of  about  6-8  m.  The  comparison  of  the 
assimilated  data  with  the  data  is  shown  in  Figures  5  and  6.  It  is  seen  that  the  model 
reproduces  the  current  meter  data  above  65  m  quite  well  and  very  poorly  below.  This  is  a 
simple  example  of  ocean  data  assimilation.  This  research  is  available  in  detail  in  Yu  and 
O'Brien  (1991).  In  Yu  and  O'Brien  (1992),  we  also  change  the  initial  condition  with 
improvement  (Table  1).  There  are  additional,  completed  examples  of  this  work  showing 
how  to  assimilate  sea  level.  In  this  report  I  have  not  tried  to  reference  all  the  important 
works  by  other  research  teams. 


Table 1. Change of Correlation Coefficient with Depth

Depth z (m)     New r*         Old r
5               0.92           0.87
25              0.88           0.81
35              0.71           0.67
75              0.34           0.28
95              0.53           0.44
Max A           1.4 × 10^-3    2.9 × 10^-3
                1.2 × 10^-3    1.3 × 10^-3

* Initial condition adjusted.


Figure 2. Current observations at 5 m (top) and 15 m (bottom) as a function of time (hours).
Figure 3. The variation of (left) the cost function and (right) the gradient with the number of iterations.
Figure 4. (left) The variation of the eddy viscosity coefficient during the iterative process, and (right) the
variation of the drag coefficient with the number of iterations.
Figure 5. Comparison of modelled (solid lines) and observed (dashed lines) current speeds u (left) and v
(right) for 5, 15, and 25 m.
Figure 6. Comparison of modelled (solid lines) and observed (dashed lines) current speeds (m/sec) for u (left) and
v (right) at 35 and 65 m.
ABSTRACT

An  objective  procedure  is  presented  which  allows  the  systematic  determination  of  free 
model  parameters  in  numerical  models.  A  nonlinear  inverse  technique  is  applied  to  fit  the 
model  to  observations.  Optimal  values  for  the  free  parameters  are  found  in  a  systematic 
way  by  minimizing  the  least  squares  distance  between  modeled  and  observed  data. 

The  method  is  applied  to  a  general  circulation  model  (GCM)  of  the  Atlantic  Ocean.  The 
GCM  includes  an  embedded  mixed  layer  model  based  on  the  equation  of  turbulent  kinetic 
energy.  A  number  of  free  parameters  describe,  e.g.,  the  efficiency  of  wind  stirring,  decay 
scales of turbulence, and so forth. They are determined by fitting the annual cycle of the
modeled  mixed  layer  depth  to  climatological  data.  The  parameter  values  and  their  error 
covariance  matrix  are  computed. 

INTRODUCTION 

Adjustable  model  parameters  are  involved  in  almost  all  numerical  ocean  models.  As  an 
example,  horizontal  exchange  of  momentum  or  of  tracers  that  is  due  to  small  scale 
processes  is  commonly  described  by  a  diffusion  term.  Another  example  is  the  drag 
coefficient  that  is  used  to  convert  surface  wind  speed  in  the  atmosphere  to  surface  stress  of 
the  ocean. 

These  parameters  may  serve  different  purposes.  The  diffusion  coefficient  is  primarily 
intended to describe directly the effect of mixing and stirring. On the other hand, a much
larger coefficient may be necessary to ensure numerical stability or damp out computational
modes. It is therefore necessary to define exactly the purpose of the parameter involved
before  values  are  assigned  to  it. 

Values  are  often  chosen  according  to  the  intuition  of  the  modeler.  If  the  intuition  fails,  a 
small  parametric  study  may  help.  One  example  for  this  type  of  study  is  the  treatment  of 
bottom  friction.  Here  models  are  frequently  “tuned”  by  trying  out  a  few  bottom  friction 
coefficients  that  span  two  or  three  orders  of  magnitude.  The  coefficient  that  leads  to  the 
“best  results”  is  then  chosen. 

Parameters of mixed layer models have been tuned to fit the data of a certain location, such
as Ocean Station Papa (Martin, 1985), or of the equatorial Pacific (Garwood et al.,
1985a,b). Some of the parameters, such as the efficiency of wind stirring m_0, can be
measured  in  laboratories.  It  is  now  our  task  to  find  out  if  the  same  values  are  applicable  in 
the  context  of  a  global  model. 

An objective way to fit models to data is the application of inverse techniques.

Distributions  of  active  or  passive  tracers  may  be  inverted  to  derive  flow  velocities  and 
diffusion coefficients (Wunsch, 1985; Fiadeiro and Veronis, 1984; Olbers et al., 1985). In
these  inversions  more  or  less  complicated  models  are  applied.  In  the  following  we  will 
describe  a  method  to  determine  free  parameters  of  highly  nonlinear  models.  The  technique 
is iterative and very general. In the example given below it is applied to a mixed layer
model that is coupled to a general circulation model. Following ideas suggested by
Tarantola  and  Valette  (1982),  a  sequence  of  linear  subproblems  is  solved  wherein  each 
solution  is  a  compromise  between  observation  and  prior  information. 

A  brief  summary  of  the  general  circulation  model  and  the  mixed  layer  model  is  given  in 
the  next  sections,  followed  by  the  presentation  of  the  data,  the  inverse  method  and  finally 
the  results. 

ISOPYCNIC  OCEAN  CIRCULATION  MODEL 

A  general  circulation  model  that  uses  isopycnical  coordinates  in  the  vertical  was  used  in 
this  study.  The  model  was  developed  by  Oberhuber  (1993a,b)  and  is  known  under  the 
name  “OPYC.”  It  includes  an  ice  model  with  viscous-plastic  rheology.  The  surface  layer  is 
modeled  as  a  fully  active  mixed  layer  of  variable  depth  in  which  temperature  and  salinity 
may  change  arbitrarily.  The  mixed  layer  is  coupled  interactively  to  the  ice  model  as  well  as 
to  the  deeper,  isopycnic  layers.  One  of  the  intentions  in  deriving  the  isopycnic  model  is  its 
use  in  climate  studies.  For  this  purpose  the  model  formulation  was  made  rather  complete. 

It  combines  primitive  equations  and  the  full  thermohaline  dynamics,  a  realistic  equation  of 
state,  convection  and  detailed  mixed  layer  dynamics  with  an  isopycnical  description  of  the 
deep  ocean.  Topography  is  arbitrary. 

An  early  version  of  the  model  has  been  applied  to  the  tropical  Pacific  (Miller  et  al.  1991). 
The  most  intensive  studies  were,  however,  performed  in  the  Atlantic  Ocean.  The  model 
has  been  described  in  detail  in  Oberhuber  (1993a,b).  The  present  study  was  undertaken  in 
support of the model development and an earlier version of OPYC was applied. The
results presented here are, accordingly, only preliminary and the successes of the isopycnic
model  should  be  judged  by  the  more  recent  work  of  Oberhuber.  The  model  version  used 
here  has  a  horizontal  resolution  of  2°  by  2°  and  seven  vertical  layers.  It  covers  the  Atlantic 
Ocean  from  30°S  to  80°N  where  it  is  closed  by  artificial  boundaries. 

The  model  is  driven  by  surface  fluxes  of  momentum,  heat,  fresh  water,  turbulent  kinetic 
energy,  and  buoyancy.  The  fluxes  are  calculated  from  monthly  mean  climatological  values. 
Windstress is taken from Hellerman and Rosenstein (1983). The other fluxes are based
on the COADS data set (Woodruff et al., 1987). The climatology was calculated by Wright
(1988)  for  the  years  1950  to  1979  on  a  2°  by  2°  grid.  From  these  Oberhuber  (1988) 
derived  all  other  quantities  necessary  to  drive  the  model. 

MIXED  LAYER  MODEL 


Mixing  in  the  surface  layer  is  caused  by  turbulence  generated  by  wind  stirring  and 
buoyancy  fluxes.  The  turbulence  produces  a  uniform  vertical  distribution  of  temperature 
and  salinity.  However  the  turbulent  kinetic  energy  (TKE)  may  vary  with  depth  within  the 
mixed  layer.  Models  of  the  mixed  layer  are  generally  based  on  a  budget  equation  of  the 
TKE:  The  input  of  TKE  at  the  surface  is  balanced  partly  by  dissipation  and  partly  used  for 
the  production  of  mean  potential  energy  by  the  entrainment  of  underlying  denser  water. 
While  wind  stirring  always  acts  as  a  source  for  TKE  the  buoyancy  flux  may  change  sign. 
Cooling  and  evaporation  act  as  production  terms  while  precipitation  and  heating  of  the 
surface layer increase the stability and limit vertical mixing. When the warming is
sufficiently strong, detrainment occurs. A new shallow mixed layer is established in which
the  input  of  TKE  by  the  wind  is  used  to  distribute  the  heat  vertically  and  produce  potential 
energy.  In  OPYC  the  underlying  old  mixed  layer  is  redistributed  into  the  isopycnic  layers 
below.  While  entrainment  is  modeled  prognostically  the  detrainment  is  treated  separately. 

At  an  early  stage  in  the  development  of  OPYC  the  mixed  layer  models  of  Kraus  and 
Turner  (1967),  Niiler  (1975),  Niiler  and  Kraus  (1977)  and  Garwood  et  al.  (1985a,b)  were 
applied.  The  experience  gained  from  these  models  led  to  a  new  formulation  for  the  mixed 
layer  equations.  The  major  process  that  governs  the  mixed  layer  depth  (MLD)  is  the 
entrainment/detrainment  cycle.  Additionally,  the  model  includes  changes  in  MLD  due  to 
convergence  of  mass  or  heat. 


The  entrainment  rate  w  is  modeled  by 

whg'  =  w/?i,.„,(Au^  +  Av^)  +  2m„aut  +  hbe{B  -  jfli) 
+  beyBs\ 


f 

f,  (-h] 

Y 

h 

1  +  exp 

—  2hg 

1-exp  — 

. 

J. 

J 

(1) 


-c^fn^xSi-y-dh-d' 


where 


B  =  -^{aQ+PR) 

CpP 


(2) 


(3) 
(4)

(5) 

(6) 

(7) 


Here h is the MLD and g' the reduced gravity between the mixed and the underlying layer. The
critical Richardson number (0.25) is denoted Ri_crit; Δu and Δv are the differences in the
horizontal velocities between the mixed and the underlying layer. The friction velocity is
denoted u_*; B is the total buoyancy flux through the surface, comprising the total heat flux
Q and the equivalent heat flux R due to the fresh water flux (P−E). The buoyancy flux B_s is
produced by the solar radiative heat flux Q_s; γ describes the fraction of solar radiation that
is not immediately absorbed and that enters the ocean. The scaling depth for the penetration
is h_B. If the MLD is sufficiently shallow, lower layers may gain heat by solar radiance. The
term involving the northward component of the planetary rotation Ω_y allows the
exchange between horizontal and vertical turbulence according to Garwood et al. (1985a,b).
The two dissipation terms d and d' will act proportional to and independent of the MLD,
respectively.


The  finding  that  less  turbulent  kinetic  energy  is  needed  for  mixing  at  high  latitudes  than  at 
low  ones  is  modeled  as  an  efficiency  term  which  depends  on  the  Ekman  scale.  Two 
functions, denoted a and b, describe which part of the kinetic energy input is available for
conversion into potential energy at the mixed layer depth h. They describe an exponential
decay that depends on −hf/(κu_*), where f is the Coriolis parameter. Functions a and b differ
in their length scales κ and μ. Here negative buoyancy fluxes are treated like wind stirring
(function  a)  while  a  positive  buoyancy  flux  such  as  cooling  is  considered  to  be  more 
efficient  (function  b).  Buoyancy  fluxes  can  be  scaled  independently  from  wind  stirring 
with  the  coefficient  e. 
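
A minimal sketch of such a decay function follows. It assumes the simplest form consistent with the description, exp(−h|f|/(scale·u_*)), with `scale` standing for κ (function a) or μ (function b). The exact functional form used in OPYC is not given here, so the form, the use of |f|, and all numerical values are assumptions for illustration.

```python
import math

def decay_efficiency(h, f, u_star, scale):
    """Fraction of the energy input still available at depth h.

    Assumed form: exp(-h * |f| / (scale * u_star)); `scale` is the
    dimensionless decay scale (kappa for function a, mu for function b).
    """
    return math.exp(-h * abs(f) / (scale * u_star))

# Illustrative values: 50 m mixed layer, mid-latitude f, u_* = 1 cm/s
a = decay_efficiency(h=50.0, f=1.0e-4, u_star=0.01, scale=0.4)
b = decay_efficiency(h=50.0, f=1.0e-4, u_star=0.01, scale=1.0)
```

With these assumed numbers the larger decay scale gives the larger efficiency, which is how function b can represent the more efficient of the two conversions.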


When  sea  ice  is  present  additional  terms  appear  in  eq.  (1)  (Oberhuber,  1993a).  In  this 
study, however, these terms were not treated as variable and are not shown here for
simplicity. 
In the detrainment phase, the entrainment velocity w is set to zero and eq. (1) is solved
diagnostically. Additionally the resulting Monin-Obukhov depth is bounded for small
values by the MLD due to vertical velocity shear h_m:

$$ h_m = Ri_{crit}\,(\Delta u^2 + \Delta v^2)/g'. \tag{8} $$

The  set  of  adjustable  parameters  under  consideration  now  consists  of  the  efficiency  of  the 
wind stirring at the surface m_0, the decay scales κ and μ which determine the decay
functions a and b, the fraction γ of the solar heating and its penetration scale h_B in the
ocean, the efficiency e of buoyancy forcing, the coefficient c that governs the Ω_y term,
and  the  two  dissipation  coefficients  d  and  d'. 

DATA 

Climatological hydrographic data compiled by Levitus (1982) are used to determine the
annual cycle of the MLD. The problem is that measurements based on turbulence are not
available for the whole area of the Atlantic Ocean. Instead our definition must be based on
the effect of turbulence on the vertical structure of mean quantities. Several ways are
possible  to  define  how  deeply  the  surface  layer  is  mixed.  A  common  approach  is  to  define 
the  depth  of  the  mixed  layer  as  the  depth  at  which  either  density  or  temperature  deviate 
from  their  surface  value  by  a  certain  margin.  When  the  bottom  of  the  mixed  layer  is 
characterized  by  large  steps  in  mean  values  of  temperature  and  salinity  the  choice  of  the 
criterion  is  not  critical.  However,  a  definition  of  the  measured  MLD  based  on  temperature 
differences  will  work  more  reliably  in  low  latitudes,  whereas  for  high  latitudes  with  their 
low  vertical  temperature  gradients  a  criterion  using  density  differences  is  preferable.  As  our 
model includes latitudes up to 80 degrees north, we have chosen a difference in σ_t to
define  the  mixed  layer  depth. 

Monthly  mean  values  of  temperature  and  salinity  are  used  to  calculate  the  mean  density 
profile  at  standard  levels.  Linear  interpolation  is  applied  to  determine  the  depth  at  which  the 
density differs from the surface density by 0.125 kg/m³. Values of the MLD of less than
10  m  were  set  to  10  m  while  values  higher  than  400  m  were  excluded  from  the 
comparison  with  modeled  MLD. 
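
The density criterion described above can be sketched in a few lines of Python. This is a minimal illustration rather than the processing actually applied to the Levitus data: the function name, the sample profile, and the choice to return None when the criterion is never met within the profile are assumptions made here.

```python
def mixed_layer_depth(depths, sigma, delta=0.125, h_min=10.0, h_max=400.0):
    """Depth (m) at which density first exceeds the surface value by `delta`.

    depths: standard-level depths in m, increasing, depths[0] is the surface
    sigma:  density (kg m^-3) at those levels
    Returns None if the criterion is not met above h_max (value excluded);
    results shallower than h_min are set to h_min.
    """
    target = sigma[0] + delta
    for k in range(1, len(depths)):
        if sigma[k] >= target:
            # linear interpolation between standard levels k-1 and k
            frac = (target - sigma[k - 1]) / (sigma[k] - sigma[k - 1])
            h = depths[k - 1] + frac * (depths[k] - depths[k - 1])
            if h > h_max:
                return None
            return max(h, h_min)
    return None  # assumption: criterion never met means MLD undefined


# Hypothetical monthly mean profile (not taken from the climatology)
depths = [0.0, 10.0, 20.0, 30.0, 50.0, 75.0, 100.0]
sigma = [25.00, 25.01, 25.03, 25.05, 25.10, 25.20, 25.60]
mld = mixed_layer_depth(depths, sigma)  # between the 50 m and 75 m levels
```

Returning None for excluded values keeps them out of any later misfit sum without introducing an artificial depth.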

Figure 1 shows the monthly distribution of the measured MLD. January is depicted in the
upper left panel, April in the upper right, etc., until December in the lower right corner.
Contour lines are at every 100 m with additional contours at 25 m and 50 m. The MLD in
the Gulf of Guinea is always shallower than 25 m. Values increase poleward to over 400
m at the northern wall. Main features are a shallow mixed layer in the equatorial band and a
strong seasonal cycle in mid latitudes.

Figure 1. Monthly mean depth of mixed layer derived from climatological data of Levitus. The depth
is defined by a density difference of Δσ_t = 0.125 kg m^-3. Upper left: January to lower right:
December. Contours at 25, 50, 100, 200, 300, 400 m depth.

OPTIMIZATION  METHOD 

Parameters  are  found  objectively  by  minimizing  the  rms  misfit  between  modeled  and 
measured  MLD.  The  method  involves  an  iterative  technique  wherein  first  the  model 
sensitivity  is  calculated  and  second  the  optimal  set  of  parameters  is  estimated,  followed 
again  by  a  sensitivity  analysis,  and  so  forth.  Prior  information  on  the  set  of  parameters  is 
taken  into  account  by  using  the  error  covariance  matrix  during  the  optimization  (Tarantola 
and  Valette,  1982). 

After choosing a first guess for the parameters the full coupled sea ice-mixed layer-isopycnic
model OPYC is integrated. After five years of model time the mixed layer has
reached an almost cyclo-stationary state. Monthly averages of the MLD of the last year,
denoted as h_0, are stored for future computations.

To find the minimum of the data misfit the model is linearized around its current set of
parameters. For this purpose all parameters under consideration are perturbed individually.
For each perturbation OPYC is integrated for five years. This integration period seemed to
be necessary for the MLD to come to a cyclo-stationary state after major parameter
changes such as the introduction of the term governed by c. To ensure that differences in
the MLD result from parameter changes and are not due to an undetected trend in the model,
the same initial conditions and integration time as in the control experiment are used in the
perturbation runs. The differences h_i between the modeled MLD of the last year and h_0 are
stored again.

Our linear model for the MLD then consists of the reference solution plus a linear
combination of the perturbations

    h_{lin}(X) = h_0 + \sum_i x_i h_i    (9)


The vector X consists of n components x_i. They describe which fractions of the parameter
changes applied to calculate h_i are used to compute the linear approximation h_lin of the
MLD. For the linear model the data misfit J_dat can be written as

    J_{dat} = (h_m - h_{lin}(X))^T W (h_m - h_{lin}(X))    (10)

where h_m denotes the measured MLD, with a diagonal weighting matrix W defined as

    W = \mathrm{diag}\left( \frac{\cos\varphi}{10\,\mathrm{m}} \right)    (11)


The weights are proportional to the area represented by the measurement. They are
normalized with a uniform rms of 10 m. The errors are assumed to be uncorrelated. Of
course the weighting can be changed to represent the error of the individual estimates of the
MLD. For instance a weighting proportional to the MLD itself was tried out as an
alternative to (11). The change in the optimal parameters was, however, small. The
sensitivity of the results to the choice of W seems to be low. Of course the absolute values
in W are important only in comparison to the standard deviations σ_i of the parameters.
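
The linear model (9) and the weighted misfit (10) can be illustrated with a small Python sketch. The numbers are invented for illustration, and the normalization of the weights (cosine of latitude divided by the squared 10 m rms, which makes the misfit nondimensional) is an assumption of this sketch rather than a statement about the exact form of (11).

```python
import math


def linear_mld(h0, h_pert, x):
    """Equation (9): h_lin = h0 + sum_i x_i * h_i, evaluated point by point."""
    return [h0[j] + sum(x[i] * h_pert[i][j] for i in range(len(x)))
            for j in range(len(h0))]


def area_weights(lat_deg, rms=10.0):
    """Diagonal of W: cos(latitude), scaled by an assumed uniform rms in metres."""
    return [math.cos(math.radians(lat)) / rms ** 2 for lat in lat_deg]


def data_misfit(h_meas, h_model, w):
    """Weighted sum of squared differences between measured and modeled MLD."""
    return sum(wj * (hm - hl) ** 2 for wj, hm, hl in zip(w, h_meas, h_model))


# Hypothetical values at three grid points (metres)
h0 = [50.0, 120.0, 30.0]                      # reference run
h_pert = [[-10.0, -40.0, -5.0],               # h_1: response to perturbation 1
          [-4.0, -10.0, -2.0]]                # h_2: response to perturbation 2
h_meas = [45.0, 100.0, 27.0]
x = [0.5, 0.25]

h_lin = linear_mld(h0, h_pert, x)
w = area_weights([0.0, 40.0, 60.0])
J_ref = data_misfit(h_meas, h0, w)            # misfit of the reference run
J_lin = data_misfit(h_meas, h_lin, w)         # misfit of the linear model
```

In this toy case the interpolated solution reduces the misfit relative to the reference run, which is the situation the optimization looks for.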


The σ_i are used to describe our a priori knowledge about the different parameters. This
information is built up during many previous iterations and model reconfigurations. As the
result of early iterations some of the parameters were discarded (e.g., by putting them to
zero) or fixed to specific values. Values for the remaining parameters are known better and
better during the iteration process. Furthermore the sensitivity of the MLD to changes in
the parameters is known from previous experiments. This information is used for the first
guess and the variations of the parameters.

The total function to be minimized consists of the data misfit and an additional
regularization term, which penalizes the deviation of the solution from its first guess,

    J_{tot} = J_{dat} + X^T S X    (12)

where S = diag(σ_i^{-2}) is the inverse of the a priori covariance matrix of X. The minimum
of J_tot can easily be found by setting the partial derivatives to zero.

    \frac{\partial J_{tot}}{\partial X} = -2 (h_m - h_0 - HX)^T W H + 2 X^T S = 0    (13)

where the matrix H consists of the MLD differences h_i. Solving (13) for X yields


    A X = Y    (14)

with

    A = H^T W H + S    (15)

and

    Y = H^T W (h_m - h_0).    (16)

From  the  retrieved  X  we  can  directly  calculate  the  optimal  set  of  parameters. 
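
Equations (14) to (16) amount to a regularized least-squares problem. The NumPy sketch below, with synthetic data and hypothetical names, assembles A and Y and solves for X; it is meant only to make the algebra concrete and is not the code used with OPYC.

```python
import numpy as np


def solve_fractions(H, w, s_diag, h_meas, h0):
    """Solve (H^T W H + S) X = H^T W (h_meas - h0) for the fractions X.

    H:      (n_points, n_params) matrix whose columns are the differences h_i
    w:      (n_points,) diagonal of the weighting matrix W
    s_diag: (n_params,) diagonal of S, i.e. 1 / sigma_i**2
    """
    HtW = H.T * w                      # multiplies column j of H^T by w_j
    A = HtW @ H + np.diag(s_diag)
    Y = HtW @ (h_meas - h0)
    return np.linalg.solve(A, Y), A


# Synthetic test case: measurements generated from known fractions x_true
rng = np.random.default_rng(0)
H = rng.normal(scale=20.0, size=(200, 3))
x_true = np.array([0.2, 0.5, 0.8])
h0 = np.full(200, 100.0)
h_meas = h0 + H @ x_true
w = np.full(200, 0.01)

x_free, _ = solve_fractions(H, w, np.zeros(3), h_meas, h0)     # no prior
x_reg, A = solve_fractions(H, w, np.full(3, 4.0), h_meas, h0)  # S = 4 I
```

Without the prior the known fractions are recovered; with S = 4 I the solution is pulled toward the first guess X = 0, as intended by the regularization term in (12).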

For the estimation of the a posteriori error covariance matrix E of X we apply the singular
value decomposition of A. The advantage of this approach is that we can easily use
alternative truncated solutions with their resolution and error covariance analysis (Wunsch,
1989).

    A = U \Lambda V^T    (17)

where U and V consist of the eigenvectors of A. Eigenvalues λ_k are stored in descending
order in the diagonal matrix Λ. E can now be calculated as

    E = \langle (\hat{X} - X)(\hat{X} - X)^T \rangle    (18)

E is expanded in the singular vectors u_i and v_i, the eigenvalues λ_i and S^{-1}; the
expansion simplifies if

    S^{-1} = \sigma^2 I    (19)
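
A truncated solution of the kind mentioned above can be sketched with NumPy's singular value decomposition. The matrix and right-hand side are hypothetical, and the function shows only the generic truncated inverse and its resolution matrix, not the specific error covariance of (18).

```python
import numpy as np


def truncated_solution(A, Y, k):
    """Solve A X = Y keeping only the k largest singular values of A."""
    U, lam, Vt = np.linalg.svd(A)          # A = U diag(lam) Vt, lam descending
    coeff = (U[:, :k].T @ Y) / lam[:k]     # projections of Y, scaled by 1/lambda
    X = Vt[:k].T @ coeff
    resolution = Vt[:k].T @ Vt[:k]         # V_k V_k^T, the identity when k = n
    return X, resolution, lam


# Hypothetical symmetric system with one weakly determined direction
A = np.array([[4.0, 1.0, 0.5],
              [1.0, 3.0, 0.2],
              [0.5, 0.2, 0.1]])
Y = np.array([1.0, 2.0, 0.3])

X_full, R_full, lam = truncated_solution(A, Y, 3)
X_trunc, R_trunc, _ = truncated_solution(A, Y, 2)
```

With all singular values retained the result equals the direct solution of (14); dropping the smallest one trades resolution (the trace of the resolution matrix falls from 3 to 2) for a solution that is less sensitive to noise.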

The  only  problem  that  remains  is  to  find  suitable  perturbations  of  the  parameters,  which 
turns  out  to  be  quite  an  art.  A  lot  of  intuition  and  experience  from  previous  iterations  is 
involved.  The  difficulties  in  deriving  a  “reasonable”  set  will  become  clear  in  the  following. 

First, it is better to interpolate than to extrapolate. When we calculate the local gradient of
the MLD only small perturbations are used and the computed h_i will be small, in general.
In order to produce MLD differences of appreciable size the corresponding x_i must be
large, i.e., ≫ 1. With these large coefficients the linear model h_lin extrapolates and the
MLD will be quite different from the nonlinear OPYC using the optimized parameters. In
some areas the extrapolated h_lin is considerably deeper than in its neighborhood or, on the
contrary, may even become negative. The reason for this unrealistic behavior lies in the
strong nonlinearities of the mixed layer dynamics. For every gridpoint there is a time in the
seasonal cycle when the entrainment period terminates and detrainment occurs with a
corresponding rapid change in the MLD. This decrease is often on the order of 100 m. A
small perturbation in the model parameters will change the MLD both in the entrainment
and the detrainment phase only slightly. However, it will shift the onset of the detrainment
by a few days. For these few days we compute large differences in the MLD which
multiplied by x_i ≫ 1 produce unrealistic results. Reducing the perturbations only
intensifies the spiky appearance of the h_i and makes the response more local in space and
time. As we will see below the error between modeled and measured MLD consists partly
in a bias and partly in a phase shift. Such a phase shift cannot be modeled successfully with
spiky h_i. As a consequence we impose a constraint to ensure that the modeled MLD will be
an interpolation between meaningful solutions and no extrapolation:

    0 \le x_i \le 1,    i = 1, \dots, n    (20)

The changes in the parameters applied to compute h_i must be chosen accordingly. Ideally
the x_i should be approximately 0.5 at the solution to ensure a good compromise between
gradient calculation and interpolation. Constraint (20) implies that for positive and negative
parameter disturbances separate model integrations have to be performed.

Only  a  limited  number  of  perturbation  experiments  were  done  because  every  run  requires 
several  hours  of  CPU  time  on  a  Cray  computer  for  the  integration  of  OPYC.  Therefore  we 
restricted  the  number  of  perturbations  to  the  minimum.  Only  when  it  turned  out  (which  it 
frequently  did)  that  a  perturbation  was  of  the  wrong  sign  or  was  made  too  small  was  a  new 
model  integration  performed  and  the  set  of  hi  augmented. 

Because of the constraint (20) the solution of (13) is slightly more complicated than
described previously. If the optimal x_i turns out to be zero we need another perturbation run
for the corresponding parameter with a changed sign; x_i = 1 on the other hand makes a
larger perturbation necessary. Rather than overwriting the corresponding h_i we set the
corresponding x_i to zero and augment the set of variables. The frequent changes in the set
of h_i make it necessary to retain all information until the final solution is found. Of course
optimal parameter changes cannot be positive and negative at the same time. In this case
the smaller change is discarded. The optimal x_i then consist of a number of zeros and
values smaller than one where only nonzero values are used for the solution. Once the final
set of variations x_i and h_i has been found we can again apply equations (13) and following
to compute the solution and its error covariance matrix.
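
The bookkeeping implied by constraint (20) can be illustrated as follows. The sketch solves the box-constrained version of (14) by projected Gauss-Seidel sweeps, which is a technique substituted here for illustration and not the iteration described above, and then reports which follow-up run each x_i would call for. The matrix, right-hand side and function names are hypothetical.

```python
def solve_box(A, Y, n_sweeps=200):
    """Minimize X^T A X - 2 Y^T X subject to 0 <= x_i <= 1.

    Projected Gauss-Seidel: each x_i is updated in turn and clipped to [0, 1].
    A must be symmetric positive definite (given as a list of rows).
    """
    n = len(Y)
    x = [0.0] * n
    for _ in range(n_sweeps):
        for i in range(n):
            r = Y[i] - sum(A[i][j] * x[j] for j in range(n) if j != i)
            x[i] = min(1.0, max(0.0, r / A[i][i]))
    return x


def next_step(x_i, tol=1e-9):
    """Follow-up suggested by the value of x_i at the constrained optimum."""
    if x_i <= tol:
        return "rerun with perturbation of opposite sign"
    if x_i >= 1.0 - tol:
        return "rerun with larger perturbation"
    return "keep"


# Hypothetical 3-parameter system
A = [[4.0, 1.0, 0.0],
     [1.0, 4.0, 1.0],
     [0.0, 1.0, 4.0]]
Y = [-1.0, 2.0, 6.0]

x = solve_box(A, Y)
actions = [next_step(xi) for xi in x]
```

In this example the first fraction sits at the lower bound and the third at the upper bound, so only the second perturbation would be retained unchanged.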

It is still possible to find gridpoints where h_lin(X) behaves unreasonably. For instance, in the
control experiment an area might be marginally unstable and convection produces a deep
MLD. In most perturbation runs convection does not occur and we have a situation where
locally many h_i are large (and negative). Their weighted sum may produce an h_lin < 0, that is,
a negative thickness. A similar argument can be given for extremely deep values for the
linear model. To safeguard against such a behavior we could add another constraint

    \sum_{i=1}^{n} x_i \le 1    (21)

The disadvantage of (21) is that the solution now may depend on the number of variables n.
Also we expect reasonable values of the x_i to be around 0.5. These disadvantages are
avoided by requiring that h_lin lies in the same depth interval as the measurements, i.e.

    10 m \le h_{lin} \le 400 m    (22)

Values which violate (22) are excluded from the calculation. Thus the number of 2° by 2°
boxes involved in the optimization may vary during the iteration.
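
The effect of (22) can be seen in a toy case with two grid points, the first of which is convective in the control run only. The depths and the helper name are invented for this sketch.

```python
def valid_points(h_lin, h_min=10.0, h_max=400.0):
    """Indices of grid points whose linear-model MLD satisfies constraint (22)."""
    return [j for j, h in enumerate(h_lin) if h_min <= h <= h_max]


# Point 0: deep convection in the control run, absent in all perturbation runs.
# Point 1: ordinary response to the perturbations.
h0 = [350.0, 80.0]
h_pert = [[-300.0, -10.0],
          [-320.0, -6.0],
          [-310.0, -4.0]]
x = [0.5, 0.5, 0.5]

h_lin = [h0[j] + sum(xi * hi[j] for xi, hi in zip(x, h_pert)) for j in range(2)]
kept = valid_points(h_lin)
```

Here the weighted sum gives a negative thickness at the convective point, which is therefore dropped, while the second point remains in the optimization.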


ESTIMATION  OF  FREE  PARAMETERS 




RESULTS 

a)  Early  results 

It was soon found that some of the parameters retrieved were close to their theoretical
values or to values measured in laboratories. Therefore γ was set to 0.42 and Ri_crit to 0.25.
Both parameters were considered fixed subsequently. Another early result was the latitude
dependence of the damping terms a and b. Attempts to model them independent of φ
failed. A scaling depth depending on u*/f, i.e., a scaling proportional to the Ekman depth,
was clearly superior. Accordingly, damping independent of φ was no longer pursued.
Values for the damping parameters d and d' were determined to be very small and we set
d = d' = 0. In the same way an independent efficiency parameter e for buoyancy was found to
be unnecessary and e was fixed at unity.

b)  Reference  solution 

The  modeled  MLD  is  depicted  in  Figure  2.  It  shows  the  same  characteristics  as  the 
measured  MLD  (Fig.  1).  The  shallow  equatorial  MLD  with  little  seasonal  variation  can  be 
clearly  seen.  Farther  north  the  annual  cycle  is  the  dominant  signal  with  deepest  values  in 
March  and  values  below  25  m  during  summer.  In  the  South  Atlantic,  values  below  25  m 
occur  during  Austral  summer,  i.e.  December  to  February.  High  values  for  the  MLD  are 
found  north  of  50°  N  during  winter  and  spring.  These  are  also  the  areas  of  highest  error  in 
the  MLD  where  the  model  is  much  too  shallow  compared  to  observations  (Figure  1). 
Farther south the errors are smaller with the exception of a phase error in the retreat of the
MLD at 20° N during spring warming.

Modeling of the mixed layer temperature (Figure 3) is relatively successful. In comparison
with measured sea surface temperature, we find errors below 1 K for high temperatures.
Errors increase to the north where they reach -4 K (model too cold) at 60°N. Both model
deficiencies have been reduced noticeably in the meantime. The major improvements were
due to the removal of the northern wall in favor of modeling the Arctic Ocean together with
the Atlantic (Oberhuber, 1993b).

Details of the annual cycle of the MLD are difficult to perceive in Figures 1 and 2. Isolines
are gappy because of undefined values. To give a clearer picture, a number of Hovmöller
diagrams are shown below that depict conditions along 30°W as a function of month.
Diagrams of measured and modeled MLD are given in Figures 4 and 5, respectively. We
notice measurements of a deep mixed layer in winter increasing to the north. Maximum
values are found in April when the depth of 200 m extends south to 40°N and the 100 m
isoline almost reaches 20°N. During warming in spring and summer the depth is reduced
gradually until minimal values are found in July and August. The seasonal cycle of the 100
m isobath is modeled fairly well. A deeper mixed layer is underestimated and a shallower
MLD is overestimated. In most of the northern hemisphere the seasonal cycle is
underestimated (Figure 6) while in the southern hemisphere little variation is observed.

Figure 2. Monthly mean of modeled MLD. Upper left: January to lower right: December. Contours as
in Figure 1. The MLD is shallow at the equator and becomes deeper in the region of the trade winds.
Farther north the annual cycle is prominent with very deep MLD during winter.

For completeness the Hovmöller diagram of the mixed layer temperature along 30°W is
given in Figure 7. In the northern hemisphere temperatures are lowest in March and
warmest in August. The seasonal cycle is most pronounced around 30°N. Differences
between modeled mixed layer temperatures and sea surface temperature measurements are
given in Figure 8. Throughout the year the error is always below 1 K in the region south of
40°N. The annual cycle is mostly visible in the error around 60°N, i.e., close to Greenland.
During winter and spring the model is too cold by up to 4 K and the error is smallest (1 K)
during November.

Figure 3. Monthly mean of modeled mixed layer temperature. Upper left: January to lower right:
December. Contour interval 5° from 0° to 25°, additional contour at 27.5°. The general temperature
distribution is close to observations except in the Gulf Stream region and in the vicinity of the
northern boundary.

c)  Perturbation  experiments 

After many iterations, five parameters remained to be determined by optimization. They
were m_0, h_B, c, κ, and μ. Their a priori values and variances were chosen according to
our knowledge gained so far (Table 1). Parameter disturbances were taken as twice the
respective rms, which implies expected rms values of all nondimensional x_i of 0.5 to fit
our requirements for interpolation. The corresponding diagonal elements of matrix S are
then 4.0. We will now discuss the results of the sensitivity experiments, i.e., the fields h_i.
Again Hovmöller diagrams along 30°W are chosen to show both the annual cycle and the
latitudinal dependence of the MLD differences.

Table 1. Values and standard deviations of estimated parameters

Parameter    a priori            a posteriori
             mean     rms        mean     rms
κ            0.4      0.083      0.396    0.027
m_0          1.2      0.2        1.060    0.129
h_B [m]      10       2.5        8.077    1.640
μ            5        2.2        3.541    1.395

The result of a change in m_0 from 1.2 to 0.8 is depicted in Figure 9. The decrease in the
input of wind-induced TKE at the sea surface leads to a corresponding decrease in the
MLD over the whole area. Values range from 5 m at the equator to 50 m at 50°N. The
sensitivity is highest during the detrainment period in both hemispheres.

Changing the penetration depth h_B for the solar radiation from 10 m to 5 m results in a
general reduction of the MLD too (Figure 10). Solar heating is concentrated more toward
the surface, the buoyancy input is more negative, and the MLD is reduced (see equation (1)).
As with m_0, the highest sensitivity is during the retreat phase. But here we find maxima at
25°N and 25°S. The large positive change during summer occurs at the coast of
Greenland. It must be attributed to a combined effect of advection and convection. Note
that the MLD is undefined prior to June.

Considering the term governed by c (Figure 11) we also find noticeable changes
concentrated in May. According to the latitudinal distribution of the wind stress we find a
decrease in MLD in the area of the westerlies north of about 30°N. Closer to the equator,
easterlies prevail and the MLD becomes deeper.

The sensitivities to the two decay functions a and b appear to be quite different. The effect
of changing κ from 0.4 to 0.33 is small and concentrated mainly in the northern area (Figure
12). On the other hand changing μ from 5 to 2 results in a large decrease of the MLD
outside the equatorial band (Figure 13). Differences are on the order of 20 m during the
time when the MLD is deepest. Of course a reduction in μ will diminish the MLD only in
regions with a positive buoyancy flux B, i.e. cooling or evaporation. Outside these areas
there will be no change. The local deepening found at the coast of Greenland during
summer is similar to Figure 10.

Figure 4. Hovmöller diagram of measured mixed layer depth at 30°W as a function of latitude and
time. Contour intervals as in Fig. 1. There is a strong seasonal cycle north of 40°N.

Figure 5. Hovmöller diagram of modeled mixed layer depth at 30°W. Contour intervals as in Fig. 1.
The seasonal cycle is less pronounced compared to measurements (Fig. 4).

Figure 6. Hovmöller diagram of error in mixed layer depth at 30°W. C.i. = 25 m. During winter the
model is too shallow in the north and too deep in the south. Detrainment in the spring is delayed.

Figure 7. Hovmöller diagram of the modeled mixed layer temperature at 30°W. Temperature rises
to 28°C at the equator.

Figure 8. Hovmöller diagram of error in mixed layer temperature at 30°W. C.i. = 1 K. Temperature
differences are small except north of 40°N where they increase. The model is too cold by as much as
4 K near Greenland.

Figure 9. Hovmöller diagram of difference in mixed layer depth at 30°W. Parameter m_0 was
changed from 1.2 to 0.8. C.i. = 5 m. Because of the decrease in the wind input of TKE the MLD
becomes shallower by up to 50 m.

Figure 10. Hovmöller diagram of difference in mixed layer depth at 30°W. Parameter h_B was
changed from 10 to 5 m. C.i. = 5 m. Less penetration of solar heating concentrates the input of
negative buoyancy more toward the surface. This stabilizes the mixed layer and decreases its
thickness by 10 m on the average.

Figure 11. Hovmöller diagram of difference in mixed layer depth at 30°W. The term governed by c
is included in the calculation. C.i. = 5 m. In the region of westerly winds we observe a retreat
of the MLD, whereas easterlies lead to a deepening.

Figure 12. Hovmöller diagram of difference in mixed layer depth at 30°W. The depth scale κ was
changed from 0.4 to 0.33. C.i. = 5 m. Less TKE is available at the bottom of the MLD resulting in a
shallower mixed layer.

Figure 13. Hovmöller diagram of difference in mixed layer depth at 30°W. The depth scale μ was
changed from 5 to 2. C.i. = 5 m. Much stronger damping of the turbulence produced by positive
buoyancy results in a retreat of the MLD. The decrease is restricted to areas of positive buoyancy flux
B.

d) Inverse solution

The  hi  calculated  above  are  now  used  to  solve  (13)  for  the  optimal  vector  X.  It  should  be 
mentioned  first  that  most  of  the  improvement  in  the  data  misfit  was  made  in  previous 
iterations.  The  remaining  error  was  predominantly  systematic  and  could  only  slightly  be 
reduced.  Differences  between  the  optimal  solution  and  the  reference  solution  are  small. 
They  amount  to  a  reduction  of  the  MLD  of  less  than  20  m  for  most  cases  (Figure  14). 
Only in areas with convection are the changes large and local.


optimized  change  in  MLD 


Figure  14.  Histogram  of  the  change  of  the  modeled  MLD  resulting  from  the  optimization  of  the 
parameters.  Most  changes  reduce  the  MLD  by  up  to  10  m.  Positive  changes  (deepening)  are  rare. 

One of the most important findings is that parameter c, which is the coefficient of the
Garwood et al. term, finally turns out to be zero. The corresponding h_i (Figure 11) leads to
a deepening in the tropical regions where the model is already too deep. Farther north the
reduction in MLD is of benefit. However, in this area other parameters, m_0 among them,
are more effective and are preferred. The modeled MLD would only be improved with a
negative c, which violates the theory of Garwood et al. (1985a,b). Consequently this theory
must be rejected in the context of our modeling effort.

It is worth mentioning that the data misfit (10) is larger for each individual perturbation
than for the control experiment. One could be tempted to believe that no improvement is
possible. However, a combination of small x_i leads to a reduction in J_dat. This is evident
from the gradient ∂J_dat/∂x_i calculated at the reference solution. It is negative for c and
positive for the other four parameters. Of course the gradient must be zero at the optimized
solution.
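
That every full perturbation can increase the misfit while small fractions still reduce it follows from the quadratic form of (10). The toy example below uses invented numbers and takes the derivative with respect to the fraction x_i; its sign with respect to the parameter itself additionally depends on the sign of the perturbation applied.

```python
def misfit(x, r, h_pert, w):
    """J_dat for the residual r = h_meas - h0 and the fractions x."""
    total = 0.0
    for j, (wj, rj) in enumerate(zip(w, r)):
        step = sum(xi * hi[j] for xi, hi in zip(x, h_pert))
        total += wj * (rj - step) ** 2
    return total


def gradient_at_reference(r, h_pert, w):
    """dJ_dat/dx_i at X = 0, which equals -2 * h_i^T W r."""
    return [-2.0 * sum(wj * hj * rj for wj, hj, rj in zip(w, hi, r))
            for hi in h_pert]


# Hypothetical case: model too deep by 10 m at three points, two spiky responses
r = [-10.0, -10.0, -10.0]
h_pert = [[-40.0, -5.0, -5.0],
          [-5.0, -40.0, -5.0]]
w = [1.0, 1.0, 1.0]

J0 = misfit([0.0, 0.0], r, h_pert, w)          # control experiment
J_single = [misfit([1.0, 0.0], r, h_pert, w),  # each full perturbation alone
            misfit([0.0, 1.0], r, h_pert, w)]
grad = gradient_at_reference(r, h_pert, w)
J_comb = misfit([0.2, 0.2], r, h_pert, w)      # combination of small fractions
```

Each full perturbation overshoots locally and raises the misfit, yet the derivative along both x_i is negative at the reference solution, and a combination of small fractions lowers the misfit substantially.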



The remaining parameters are found by solving

    \begin{pmatrix}
    4.273 & 0.264 & 0.261 & 0.287 \\
    0.264 & 4.968 & 0.627 & 0.419 \\
    0.261 & 0.627 & 4.772 & 0.382 \\
    0.287 & 0.419 & 0.382 & 4.728
    \end{pmatrix} X = Y

The x_i are positive and less than one as required by (20). The regularization term in (12) has
a strong influence on the solution. Although J_dat accounts for 97 % of J_tot most of this
error is systematic and cannot be improved much in the final iteration. Most of the
progress has already been made previously. The diagonal elements of matrix A are
dominated by the elements of S. They imply that after many iterations we have reached a
state where the solution now depends more on our a priori knowledge and less on data.
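
The dominance of the prior can be read directly off the diagonal of the printed matrix. Since A = H^T W H + S with diagonal elements of S equal to 4.0, the short calculation below separates the two contributions; the variable names are introduced here for illustration.

```python
# Diagonal of the matrix A printed above and the prior contribution from S
A_diag = [4.273, 4.968, 4.772, 4.728]
s = 4.0

# Contribution of the data term H^T W H to each diagonal element
data_part = [a - s for a in A_diag]

# Fraction of each diagonal element that stems from the a priori information
prior_share = [s / a for a in A_diag]
```

The data term adds between roughly 0.3 and 1.0 to each diagonal element, so in every case more than 80 % of the diagonal comes from the a priori information.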

Nondimensional x_i are converted to the corresponding parameter difference and the
optimal set of parameters is calculated (Table 1). Values for m_0 and κ appear to be
reasonable. The closeness of κ to the Kármán constant of 0.4 is striking. However, we
would never propose to determine the Kármán constant via assimilation of measured
mixed layer thickness. A μ of 3.5 is reasonable too as it allows less damping of buoyancy
induced turbulence compared to mechanically generated turbulence. Finally a penetration
scale of solar heating h_B of 8 m seems to be too small. In clear sea water h_B is on the order
of 20 m.
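
The conversion between the nondimensional x_i and the parameters can be checked against Table 1. The sketch below assumes that each optimal value results from a single perturbation run, namely the changes quoted in the captions of Figures 9, 10, 12 and 13; if several runs per parameter entered the final solution, the fractions would differ.

```python
# name: (a priori mean, perturbed value in the sensitivity run, a posteriori mean)
table1 = {
    "kappa": (0.4, 0.33, 0.396),
    "m_0": (1.2, 0.8, 1.060),
    "h_B": (10.0, 5.0, 8.077),
    "mu": (5.0, 2.0, 3.541),
}


def fraction(prior, perturbed, posterior):
    """Fraction x of the applied perturbation implied by the optimal value."""
    return (posterior - prior) / (perturbed - prior)


def optimal_value(prior, perturbed, x):
    """Inverse relation: parameter value obtained for a given fraction x."""
    return prior + x * (perturbed - prior)


x_implied = {name: fraction(*vals) for name, vals in table1.items()}
```

Under this assumption all implied fractions lie between 0 and 1, consistent with constraint (20); for m_0, for example, the change from 1.2 to 1.060 corresponds to about 0.35 of the applied perturbation.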

e) Error analysis

The optimal MLD h_lin is similar to the first guess h_0. A histogram of the remaining error is
shown in Figure 15. The result is still biased with a mean of the error of -7.8 m. Most of
the MLD is overestimated by some tens of meters. A noticeable number of
underestimations by 100 m and more are also found. The rms error after optimization
amounts to 48.7 m. If we now construct a linear model T_lin for the mixed layer
temperature in analogy to h_lin, where the T_i are calculated from the temperature differences
in the perturbation experiments and the x_i are taken from the optimization of the MLD, we
find only small temperature changes in comparison with the reference solution. Figure 16
shows the histogram of the final temperature errors. The differences to the measured sea
surface temperature are only slightly biased. The mean temperature error amounts to -0.49
K (model too cold) and the rms error is 1.35 K.

The seasonal cycle of the error is shown in Figure 17. The rms error (upper curve) and the
bias (lower curve) are plotted as a function of time. Straight lines depict the annual mean
of the rms (48.7 m) and of the bias (7.8 m). We notice a moderate seasonal cycle in the
data misfit with the smallest values in May. The bias, on the other hand, shows no
systematic time dependence. Errors of the mean are low in March when the MLD is deep
and in August to November when the MLD is shallow. In between the MLD is
overestimated in the mean by as much as 20 m.

SCHRÖTER

Next we study the latitudinal dependence. We find that the error (full line in Figure 18) is
very small at the equator and increases poleward. The maximum of about 120 m rms is
found at 60°N. However, here the number of points that enter the optimization (thin line in
Figure 18) has already dropped considerably down from its maximum at 30°N. The total
misfit J_dat is therefore only moderately influenced by the errors at 60°N and farther north.


Figure 15. Histogram of the error of the modeled MLD. The distribution of the model error is
strongly skewed. Most model values are too deep. However, there is a strong contribution by the
values which are too shallow by 100 m and more.

Figure 16. Histogram of the error of the modeled mixed layer temperature. The model is slightly too
cold (0.14 K on average). Most errors are smaller than 1 K.

Figure 17. Seasonal cycle of the rms error in MLD and the mean of the misfit. The rms error is
smallest in May, the error in the mean is highest in June. Straight lines depict the annual mean. On
average the modeled MLD is too deep by 6.9 m.

Figure 18. Rms error of the MLD in m (solid line) and number of 2° by 2° grid-cells used in the
optimization (thin line) as a function of latitude. The error is very small at the equator and increases
poleward with maximum values of 180 m at 60°N. The number of points drops sharply north of 30°N
mainly as a result of the increase in undefined MLD.

Large errors in temperature occur in the same areas as large MLD errors, i.e. mainly north
of 60°N. There is, however, no distinct connection between the large errors. We have
recalculated the whole data assimilation retaining only MLD where the temperature error
was below 1 K. Nevertheless, the optimized parameters were practically the same as
before. The x_i changed by less than 10%, the total number of points was reduced from
14782 to 10043 and the rms error from 48.7 to 41.3 m.

The improvement in modeling of the MLD can be seen in the differences in the misfit
before and after the data assimilation. Figure 19 shows the probability density function
(pdf) of the remaining error versus the initial error. For practically all negative errors
(model too deep) values are above the 45° line. The improvement amounts to changes
between 5 and 15 m and is higher for large errors.

Figure 19. Frequency distribution (pdf × 1000) of final error in MLD versus initial error. Most errors
are negative (model too deep) with a maximum of probability at about -20 m. The final optimization
improves the errors by 10 m and more.

Finally  we  calculate  the  error  covariance  matrix  E  according  to  (18).  As  the  dimensions  of 
the  parameters  are  different  and  their  variances  (Table  1)  have  a  different  order  of 
magnitude  we  have  chosen  to  show  the  error  correlation  matrix  (Table  2)  instead  of  E.  As 
can be deduced from the similarities in the hi, all cross-correlations are negative. That is,
overestimation of one parameter is correlated with underestimation of the others. Most
cross-correlations are, however, very small, with the exception of the anti-correlation of the
errors in m0 and hg.
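The step from an error covariance matrix to an error correlation matrix can be sketched as follows. This is a minimal illustration, not the authors' code: the function name and the 2 x 2 covariance values are hypothetical and are not taken from Table 1.

```python
import math

def correlation_from_covariance(cov):
    """Return the correlation matrix R_ij = E_ij / sqrt(E_ii * E_jj)."""
    n = len(cov)
    std = [math.sqrt(cov[i][i]) for i in range(n)]
    return [[cov[i][j] / (std[i] * std[j]) for j in range(n)]
            for i in range(n)]

# Hypothetical covariance of two parameters whose variances differ by
# several orders of magnitude (illustrative numbers only).
E = [[4.0e-4, -0.02],
     [-0.02, 25.0]]
R = correlation_from_covariance(E)
print(R)
```

Normalizing by the standard deviations removes the dependence on the units of each parameter, which is why the correlation matrix is easier to read than E when the variances differ widely.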
Table 2. Correlation matrix

Parameter      κ        m0       hg [m]    μ
κ              1.000   -0.046   -0.046   -0.056
m0            -0.046    1.000   -0.131   -0.073
hg [m]        -0.046   -0.131    1.000   -0.073
μ             -0.056   -0.073   -0.073    1.000

Conclusions 

An inverse method has been applied to determine the free parameters of a nonlinear mixed
layer model. The method is successful in reducing the misfit between modeled and
measured mixed layer depth. Parameters and their error covariance matrix are determined
by data assimilation. The major advantage is an objective test of different competing model
formulations. A number of different damping mechanisms were examined and finally an
exponential decay of turbulent kinetic energy that scales with the Ekman depth was selected.
In the same way we tested the hypothesis of Garwood et al. (1985a,b), which introduces a
source of turbulent kinetic energy for easterly and a sink for westerly winds. This theory
was rejected because it did not fit the data.

An  error  analysis  is  a  necessary  step  to  determine  systematic  errors  which  cannot  be 
removed  by  parameter  optimization.  On  the  contrary,  one  has  to  be  careful  that  some  of  the 
parameters  are  not  tuned  to  alleviate  these  errors.  For  instance,  in  our  case  the  penetration 
depth  of  the  solar  heating  was  diminished  to  reduce  a  systematic  overestimation  of  the 
mixed  layer  depth.  For  the  same  reason  the  efficiency  of  wind  stirring  was  underestimated 
in  comparison  to  more  advanced  model  versions.  We  would  like  to  point  out  that  currently 
the  following  set  of  parameters  is  advised  for  OPYC: 

m0 = 1.2, κ = 0.4,
μ = 2, Ri_crit = 0.25,
c  =  d  =  d'  =  0, 
e  =  1,  y  =  0.42  and 
hg  =  23  m. 
ABSTRACT

To  determine  the  value  of  the  adjustable  parameters  of  an  ocean  model  that  are  required  to 
optimally  fit  the  observations,  an  adaptive  inverse  method  is  developed  and  applied  to  a 
sea  surface  temperature  (SST)  model  of  the  tropical  Atlantic.  The  best-fit  calculation  is 
performed  by  minimizing  the  misfit  between  observed  and  simulated  data,  which  depends 
on  the  observational  and  the  modelization  errors.  An  adaptive  procedure  is  designed 
where  the  model  that  is  being  tuned  is  also  used  to  construct  a  sample  estimate  of  the 
observational  error  covariance  matrix.  Assuming  idealized  modelization  errors,  the 
procedure  is  applied  to  the  SST  model  of  Blumenthal  and  Cane  (1989),  yielding  improved 
estimates  for  several  model  and  heat  flux  parameters.  The  tuned  model  provides  a  better 
simulation  of  the  mean  annual  SST,  but  the  model's  ability  to  represent  the  seasonal  and 
the  interannual  variability  is  not  improved,  and  the  model-observation  discrepancies 
remain  too  large.  The  existence  of  larger  model  deficiencies  than  was  originally  assumed  in 
the  model  errors  is  confirmed  by  a  statistical  test  of  the  correctness  of  the  assumptions  in 
the  inverse  calculation. 

1. INTRODUCTION

All  oceanic  models  contain  parameterizations  of  such  physical  processes  as  convection  and 
mixing.  Surface  forcing  also  depends  on  poorly  known  parameters.  Parameterizations  are 
based  on  physical  ideas,  but  typically  yield  forms  that  contain  parameters  whose  values  are 
not known precisely. A parameter is often model dependent (e.g., mixing is a function of
grid  spacing),  hence  parameter  tuning  may  be  in  part  model  dependent.  In  view  of  their 
inherent  imprecision,  the  uncertain  parameters  should  be  tuned  against  observed  data.  At 
the  same  time,  models  should  be  consistent  with  known  physics  to  within  the  tolerances 
allowed  by  the  approximations  made. 

Particularly  in  the  tropics  where  observations  are  sparse,  both  forcing  and  verification  data 
are  imprecisely  known.  Hence,  the  accuracy  to  be  expected  in  model  simulations  is  limited, 
even  if  the  physics  are  perfectly  represented,  and  data  uncertainties  should  be  taken  into 
account  in  parameter  tuning.  Frankignoul  et  al.  (1989)  have  developed  a  multivariate 
model  testing  procedure  that  provides  an  objective  measure  of  the  fit  between  ocean 
model simulations and observations, taking into account the data uncertainties. By using a
trial and error approach, the method can be used for model tuning (Duchêne and
Frankignoul,  1991;  Braconnot  and  Frankignoul,  1993).  However,  this  requires  that  the 
number  of  adjustable  parameters  is  small. 

A  more  efficient  tuning  approach  is  that  of  Blumenthal  and  Cane  (1989),  who  used  inverse 
modeling  procedures  to  determine  the  parameter  values  required  to  optimally  fit  sea 
surface  temperature  (SST)  in  a  simplified  tropical  SST  model.  A  priori  knowledge 
constraining  the  parameter  range  was  included  in  the  calculation,  but  only  a  highly 
idealized  model  was  used  for  the  data  errors.  The  error  model  enters  the  measure  of  the 
misfit  between  observed  and  predicted  data  which  is  minimized  in  the  best-fit  calculation. 
Thus,  the  atmospheric  forcing  uncertainties  need  to  be  properly  represented,  as  they 
introduce  large  uncertainties  in  the  model  response. 

As  the  forcing  uncertainties  have  large  and  poorly  known  correlation  scales,  the  error 
estimates  are  best  derived  from  direct  simulations.  We  have  thus  developed  an  adaptive 
tuning  procedure,  where  the  model  that  is  being  tuned  is  also  used  to  construct  the 
observational  error  model  for  the  best-fit  calculation.  The  tuned  model  is  then  tested 
against  observations,  and  if  it  agrees  with  the  data  to  within  expected  errors,  it  will  be 
judged adequate. Such an adaptive technique combines the model tuning of Blumenthal
and  Cane  (1989)  and  the  model  testing  of  Frankignoul  et  al.  (1989).  Although  the 
procedure  is  developed  in  the  context  of  a  simplified  tropical  sea  surface  temperature 
model,  it  is  general  as  long  as  the  parameter  dependence  is  linear.  The  adaptive  procedure 
requires  little  computation  and  programming,  and  is  much  simpler  to  implement  than  the 
adjoint method. However, since the effective degrees of freedom of the error estimates are
limited by the length of the sample, the number of parameters that can be tuned is limited.

The  emphasis  here  is  on  the  adaptive  inverse  procedure,  although  it  is  introduced  in  the 
context  of  a  tropical  SST  model.  An  in-depth  discussion  of  the  results  is  given  in  Scoffier 
et  al.  (1993). 

2.  MODELING  SEA  SURFACE  TEMPERATURE  VARIATIONS 
a.  Ocean  model  and  surface  heat  flux 

The  ocean  model  is  that  of  Blumenthal  and  Cane  (1989,  hereafter  BC).  The  velocity  is 
predicted  with  a  linear,  multimode  equatorial  beta-plane  model  with  a  surface  mixed  layer 
of  constant  depth  h=  35  m,  which  adds  a  direct  Ekman  flow  to  the  modal  currents.  The 
model  has  five  vertical  modes,  which  are  characteristic  of  mean  tropical  Atlantic 
conditions  and  have  gravity  wave  speed  of  2.36,  1.38,  0.89,  0.69  and  0.53  m/s, 
respectively.  The  model  basin  extends  from  30°N  to  20°S  and  has  a  simplified  geometry; 
its  resolution  is  1°  in  longitude  and  0.5°  in  latitude  and  the  time  step  is  one  week.  The 
equations  are  solved  in  the  longwave  approximation,  so  that  the  model  is  only  appropriate 
away from the western boundary. In the following, we only consider the domain in Figure
1, which should not be affected by the artificial boundaries of the model.

The SST is uniform in the mixed layer and determined from the net balance of horizontal
advection, upwelling, horizontal diffusion, and surface heat exchanges:

$$\partial_t T + u\,\partial_x T + v\,\partial_y T + \gamma\,\frac{w}{h}\,(T - T_d) - \kappa\,\nabla^2 T = \frac{Q}{\rho C_p h} \qquad (1)$$

where $w$ is the vertical velocity at the mixed layer base in case of entrainment and zero
otherwise, $\kappa$ a horizontal diffusion coefficient, $Q$ the surface heat flux into the mixed
layer, and $T_d$ the temperature below the mixed layer, which is parameterized as a function of
the thermocline depth. As in BC, the parameterization of $T_d$ is done in two parts: first the
temperature at the mixed layer base is fit to the depth of the 20°C isotherm in the
equatorial zone using observations, then the 20°C isotherm depth is fit to the model
prediction of the thermocline depth. The upwelling term is usually written as $w(T - T_e)$,
where $T_e$ is the temperature of the water entrained into the mixed layer, but the two forms
are equivalent if

$$T_e = (1 - \gamma)\,T + \gamma\,T_d \qquad (2)$$

where the “entrainment efficiency” $\gamma$ is an adjustable parameter that should be less than
one, because $T_e$ is between $T$ and $T_d$.
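The equivalence of the two forms of the upwelling term under relation (2) can be checked numerically. The sketch below is only an illustration: the function names and the sample values of temperature and vertical velocity are hypothetical, and only the entrainment efficiency of 0.5 is the a priori value quoted in this paper.

```python
def entrained_temperature(T, T_d, gamma):
    """Relation (2): temperature of the water entrained into the mixed layer."""
    return (1.0 - gamma) * T + gamma * T_d

def upwelling_usual_form(w, T, T_e):
    """Usual form of the upwelling term, w (T - T_e)."""
    return w * (T - T_e)

def upwelling_model_form(w, T, T_d, gamma):
    """Form with the entrainment efficiency, gamma w (T - T_d)."""
    return gamma * w * (T - T_d)

# Hypothetical sample values: SST and subsurface temperature in degrees C,
# vertical velocity in m/s.
T, T_d, w, gamma = 27.0, 20.0, 1.0e-5, 0.5
T_e = entrained_temperature(T, T_d, gamma)
print(T_e, upwelling_usual_form(w, T, T_e), upwelling_model_form(w, T, T_d, gamma))
```

Since T - T_e = gamma (T - T_d), an entrainment efficiency below one places the entrained temperature between the mixed layer and subsurface temperatures.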

The  surface  heat  flux  parameterization  is  that  of  Seager  et  al.  (1988,  henceforth  SZC), 
which  was  designed  to  avoid  using  either  the  (poorly  measured)  air-sea  temperature 
differences  found  in  the  bulk  formulae  or  the  artificial  feedback  to  a  prescribed 
climatological  air  temperature  often  imposed  in  ocean  simulations.  This  parameterization 
only makes use of wind speed $v_a$ and fractional cloud cover $C$ as measured variables:

$$Q = 0.94\,Q_0\,(1 - a_c C + a_a \alpha) - \rho\,C_E\,L\,v_a\,a_{rh}\,q_s(T) - a_r\,(T - T_r). \qquad (3)$$

The first term is the (usual) short wave radiation, where $Q_0$ is the clear sky solar flux
reduced by the effects of a constant surface albedo and by the absorption and reflection of
the atmosphere, which depends on $C$ and solar angle $\alpha$. The second term represents the
latent heat flux, which is computed from the standard bulk formula using a fixed
percentage $a_{rh}$ of the saturation humidity $q_s(T)$ as evaporation potential; this assumes that
the moisture content of the air has equilibrated with the ocean temperature, which is a
reasonable assumption sufficiently far from the coasts. To compensate for the loss of
variability in using monthly winds, $v_a$ is not allowed to fall below 4 m/s. The smaller
sensible heat flux and back radiation are simply modeled together in the last term as being
proportional to $T$ minus a constant reference temperature $T_r$.
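A minimal sketch of how a heat flux of the form (3) could be evaluated is given below. It is not the SZC implementation. The saturation humidity formula (a Bolton-type expression), the air density, the exchange coefficient, the latent heat of vaporization, and the use of degrees for the solar angle are all assumptions made for illustration; only the a priori parameter values and the 4 m/s wind speed floor come from the text.

```python
import math

# A priori parameter values quoted in the text (those of SZC).
A_PRIORI = {"a_c": 0.62, "a_a": 0.0019, "a_rh": 0.3, "a_r": 1.5, "T_r": 273.15}

def saturation_specific_humidity(T, pressure_hpa=1013.25):
    """Assumed Bolton-type formula for q_s(T) in kg/kg, with T in kelvin."""
    t_c = T - 273.15
    e_s = 6.112 * math.exp(17.67 * t_c / (t_c + 243.5))  # vapor pressure, hPa
    return 0.622 * e_s / (pressure_hpa - 0.378 * e_s)

def surface_heat_flux(Q0, cloud, solar_angle, wind_speed, T, params=None,
                      rho_air=1.2, C_E=1.5e-3, L_v=2.5e6):
    """Heat flux into the mixed layer (W/m^2) in the form of equation (3).

    rho_air, C_E and L_v are assumed illustrative constants; solar_angle is
    assumed to be in degrees and T in kelvin.
    """
    p = A_PRIORI if params is None else params
    v = max(wind_speed, 4.0)  # wind speed is not allowed to fall below 4 m/s
    shortwave = 0.94 * Q0 * (1.0 - p["a_c"] * cloud + p["a_a"] * solar_angle)
    latent = rho_air * C_E * L_v * v * p["a_rh"] * saturation_specific_humidity(T)
    sensible_and_longwave = p["a_r"] * (T - p["T_r"])
    return shortwave - latent - sensible_and_longwave

print(surface_heat_flux(Q0=300.0, cloud=0.5, solar_angle=60.0,
                        wind_speed=6.0, T=300.0))
```

Writing the flux this way makes its linear dependence on the heat flux parameters explicit, which is the property the tuning procedure relies on.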
In the SST equation and the heat flux formulation, there are a number of parameters that
are not precisely known, but were assigned a “reasonable” value by SZC. Here we
assume that seven parameters are adjustable within reasonable ranges: the entrainment
efficiency $\gamma$, the horizontal diffusion $\kappa$, and the heat flux parameters $a_c$, $a_a$, $a_{rh}$, $a_r$ and $a_r T_r$
in (3), which we represent below by the seven-dimensional vector $a$. The a priori values
of the tunable parameters, denoted by $a_p$, are those of SZC, namely $\gamma$ = 0.5, $\kappa$ = 2 x 10*
m² s⁻¹, $a_c$ = 0.62, $a_a$ = 0.0019, $a_{rh}$ = 0.3, $a_r$ = 1.5 W m⁻² K⁻¹ and $T_r$ = 273.15 K. The drag
coefficient for the wind stress is not allowed to vary because its uncertainty is simulated
explicitly.

b.  Simulation  of  the  tropical  Atlantic  SST  climatology 

After spin-up, the model is forced by a monthly wind stress derived from ship reports for
the  period  1964-1986  and  described  in  Frankignoul  et  al.  (1989,  henceforth  FDC).  To 
simulate  the  drag  coefficient  uncertainty,  we  follow  the  Monte  Carlo  approach  of 
Braconnot  and  Frankignoul  (1993)  and  use  five  different,  equally  plausible  drag 
coefficients  in  the  bulk  formula.  They  are  calculated  by  prescribing  a  relative  humidity  of 
80% and using either a constant air-sea temperature difference of -1°C (for the
parameterization  of  Cardone  (unpublished  manuscript)),  or  a  climatological  monthly  air- 
sea  temperature  difference  derived  from  the  COADS  data  (for  the  parameterizations  of 
Liu  et  al.  (1979),  Large  and  Pond  (1981),  Isemer  and  Hasse  (1987),  and  Smith  (1988)). 

To avoid smoothing, the monthly mean wind stresses were corrected to ensure that linear
interpolation  on  the  model  time  step  would  not  alter  the  original  means.  Cloudiness  data 
are  of  poorer  quality,  so  that  cloud  cover  is  prescribed  from  the  monthly  climatology  of 
Esbensen  and  Kushnir  (1981),  with  an  added  normal  noise  of  0.1  standard  deviation  to 
crudely  simulate  its  short  space-time  scale  variability. 
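The cloud cover perturbation described above can be sketched as follows. This is a hypothetical illustration: the function name and the sample climatology values are invented, and clipping the result to the physical range between 0 and 1 is an assumption not stated in the text.

```python
import random

def perturb_cloud_cover(climatology, sigma=0.1, rng=None):
    """Add normal noise of standard deviation sigma to fractional cloud cover.

    Clipping to [0, 1] is an assumption made here to keep the fractional
    cloud cover physically meaningful.
    """
    rng = rng if rng is not None else random.Random()
    return [min(1.0, max(0.0, c + rng.gauss(0.0, sigma))) for c in climatology]

# Hypothetical monthly climatological cloud fractions at one grid point.
monthly_climatology = [0.45, 0.50, 0.55, 0.60]
noisy = perturb_cloud_cover(monthly_climatology, rng=random.Random(0))
print(noisy)
```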

Ignoring  the  first  year  to  eliminate  the  effects  of  the  unknown  initial  conditions,  we  have 
five  22-year  simulations  of  the  SST  whose  dispersion  is  representative  of  both  the 
interannual  variability  and  the  drag  coefficient  uncertainty.  The  mean  cycle  of  simulated 
SST  is  warmer  than  the  observations,  as  illustrated  in  Figure  1  for  January,  April,  July,  and 
October  by  a  comparison  with  the  mean  SST  over  the  same  period  calculated  from  the 
data  of  Servain  et  al.  (1985). 

The  differences  between  the  SST  predictions  and  the  observations  are  due  to  (a)  errors  in 
the  atmospheric  forcing  (wind  stress,  cloud)  and  the  SST  observations,  (b)  model 
shortcomings  due  to  over-simplification  of  the  physics,  or  (c)  poor  choice  of  the  model 
parameters.  To  assess  the  validity  of  the  SST  model,  we  must  take  (a)  into  account  and 
minimize (c) by an optimal tuning. Remaining discrepancies should then point to the model
deficiencies  (b). 
Figure 1. (left) Mean SST in °C during January, April, July and October for the period 1965-1986 as
predicted using the a priori values of the model parameters. (center) Corresponding SST as derived from
the observations of Servain et al. (1985). (right) Differences between simulations and observations.

Root-mean-square  (rms)  SST  differences  between  model  and  observations  on  the  2°  x  2° 
grid  of  the  latter  are  given  in  Table  1  (left  column),  where  we  distinguish  between  annual 
mean,  mean  seasonal  variations  around  the  annual  mean  (hereafter  the  mean  seasonal 
variability),  and  SST  anomalies.  The  model-observation  differences  are  large,  particularly 
for  the  long  term  mean  which  is  strongly  affected  by  a  3.9°C  mean  bias. 

A  more  quantitative  estimation  of  the  model  performances  taking  into  account  some  of  the 
uncertainties  in  the  oceanic  observations  and  the  atmospheric  forcing,  as  well  as  their 
space-time  correlations,  has  been  made  for  the  mean  seasonal  cycle  obtained  with 
Cardone’s  drag  coefficient.  Following  the  multivariate  approach  of  FDC,  we  calculate  the 
misfit

$$T^2 = (\overline{T} - \overline{T_o})'\,D^{-1}\,(\overline{T} - \overline{T_o}) \qquad (4)$$


FRANKIGNOUL,  SCOFFIER,  AND  CANE 


where $T$ and $T_o$ describe the mean seasonal cycle of modeled and observed SST,
respectively, the vector space including all grid points (on the observational grid) and the
twelve months. The overbar denotes the 22-year mean, the prime denotes the vector
transpose, and $D$ is the error covariance matrix of $(\overline{T} - \overline{T_o})$. In the calculation reported
here, $D$ is estimated from the five 22-year samples, assuming for simplicity that each year
is statistically independent. It takes into account the uncertainties in the mean seasonal
variations that are due to interannual variability and to non-systematic observational errors of
SST, wind, and cloud cover. Not represented in $D$ are systematic observational errors
(e.g., incorrect Beaufort scale), drag coefficient uncertainty, lack of high frequency
variability, and limited resolution of the wind stress curl. As the dimension of the SST field
is much larger than the degrees of freedom of $D$, the misfit (4) is calculated in a truncated
space which is sufficiently small to calculate $D$ reliably while representing the main
space-time patterns of $(\overline{T} - \overline{T_o})$.

Table 1: Rms difference in °C between observed and
modeled SST before and after tuning in the 20°N-10°S
region. The correlation between observed and simulated
monthly anomalies during 1965-86 is given in italic.

(SST_mod - SST_obs)      before tuning    after tuning
annual mean              4.0              1.9
seasonal variability     0.7              0.8
anomaly correlation      0.13             0.10

If  the  SST  fields  are  multinormal,  the  null  hypothesis  that  the  model  response  to  the  true 
forcing is equal to the true SST can be tested because the test statistic (4) is then
Hotelling's $T^2$ statistic. As shown in Table 2, $T^2$ is much larger than the critical value at
the  5%  level,  especially  for  the  yearly  mean  difference.  Although  only  part  of  the 
observational  errors  have  been  considered  in  the  test,  the  data  uncertainties  are  clearly 
insufficient  to  explain  all  the  model-observation  discrepancies,  which  must  be  mainly 
attributed  to  model  shortcomings  and  poor  parameter  tuning. 
Table 2: Misfit between model and observations in the
20°N-10°S region, before and after tuning. The critical
values for rejecting the null hypothesis of no modelization
error are given for the 5% level (right).

Misfit                   before tuning    after tuning    critical value
annual mean              906              277             4
seasonal variability     2012             1694            73

3.  AN  ADAPTIVE  PROCEDURE  FOR  MODEL  TUNING 
a.  Linear  model  corrections 

To  see  how  the  tunable  parameters  enter  the  SST  calculation,  it  is  convenient  to  write 
equation  (1)  in  matrix  form 

$$L(T) + M(T)\,a_p = 0 \qquad (5)$$

where the vector $T$ represents temperature at all the points in space and time where a
model solution has been obtained, $a_p = (\gamma, \kappa, a_c, a_a, a_{rh}, a_r, a_r T_r)$ is the vector of a priori
parameter values, and $M(T)$ and $L(T)$ are linear operators determined at all space/time points
by retaining the terms of the model equations (1) and (3) that are and are not affected by
parameter changes, respectively. Specifically, the $i$-th row of $L(T)$ includes the contribution
at space/time point $i$ from

aj+Ma,r+va/-o.94Q„ 

while  the  row  of  M(T)  correspondingly  represents  the  transpose  of  the  terms 

’wiT-T,)lh' 

0.94a,C 

-0.94g^a 

-pC,Lv\(T) 

T 

-1 
Both $L$ and $M$ depend on the atmospheric forcing, which is imperfectly known, so that
even if the model were perfect and the uncertain parameters optimally chosen, the model
predictions  would  differ  from  the  observations. 

Because SST is a relatively well-measured variable, we follow BC and estimate the
“corrective heat flux” $\delta q$ that, for the a priori values of the uncertain model parameters,
would be necessary to make the model SST match the observed SST exactly. To do so,
we run the model using the observed SST, denoted by $T_o$, instead of the calculated one,
after interpolation on the model grid. Equation (5) is then only satisfied by adding a “heat
flux correction” $\delta q$:

$$L(T_o) + M(T_o)\,a_p + \delta q = 0 \qquad (6)$$

As expected from the limited SST agreement, the heat flux correction $\delta q$ is rather large,
and additional cooling would be needed for realistic simulations (Fig. 2a).

Because $\delta q$ depends linearly on the tunable model parameters, the estimation of their
optimal value can be formulated as the linear inverse problem

$$\delta q = M(T_o)\,\delta a, \qquad (7)$$

where $\delta a = (\delta\gamma, \delta\kappa, \ldots, \delta(a_r T_r))$ represents the parameter changes that minimize the heat flux
correction $\delta q$, yielding

$$(\delta q)_{min} = \delta q - M(T_o)\,\delta a. \qquad (8)$$

A good estimator of $\delta a$ must take errors into account, as well as our knowledge of the
expected  parameter  range. 

There  are  many  sources  of  errors  in  the  estimates  appearing  in  (7).  The  wind  stress  and  the 
cloud  data  used  to  force  the  model  have  significant  errors,  resulting  in  model  response 
uncertainties  with  large  correlation  scales,  particularly  in  the  equatorial  waveguide.  The 
observed  SST  is  noisy  as  well,  although  to  a  lesser  extent.  When  the  best-fit  calculation  is 
based  on  a  mean  seasonal  cycle  as  in  this  paper,  there  are  also  sampling  errors  which 
reflect  the  interannual  variability  and  have  large  correlation  scales.  Finally,  there  are 
“irreducible” modelization errors inherent in the ocean model formulation, e.g., errors due
to subgridscale phenomena or to the oversimplification of the ocean dynamics and the
air-sea fluxes, which cannot be expected to be reduced by model tuning. The modelization
errors  (called  system  errors  in  the  Kalman  filter  literature)  thus  represent  the  errors  that 
would  exist  if  there  were  no  observational  errors  and  the  uncertain  parameters  were  at 
their  true  value. 

Using a Bayesian viewpoint, Tarantola (1987) discusses the general inverse problem in the
case of an inaccurate theory. When the forward problem is linear as in (7) and there are
Gaussian modelization errors in $M$, described by the covariance $C_T$, the solution of the


ADAPTIVE  INVERSE  TUNING 




inverse problem takes a simple form if the observational errors in $\delta q$ are Gaussian and
statistically independent from the modelization errors. If the a priori value of the
parameter correction $\delta a$ is zero, as in the present case, the optimal solution is given by the
minimum of the misfit function

$$S(\delta a) = \tfrac{1}{2}\left[(M\,\delta a - \delta q)'\,C^{-1}\,(M\,\delta a - \delta q) + \delta a'\,C_a^{-1}\,\delta a\right] \qquad (9)$$

with $C = C_T + C_d$, where $C_d$ is the error covariance matrix of the observations $\delta q$, and
the covariance matrix $C_a$ describes the a priori uncertainty of $\delta a$. The solution is

$$\delta a = (M'\,C^{-1}\,M + C_a^{-1})^{-1}\,M'\,C^{-1}\,\delta q. \qquad (10)$$
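The estimator (10) can be sketched in a few lines of code. The following is a minimal illustration under stated assumptions, not the code used by BC or in this paper: matrices are plain lists of rows, the small Gauss-Jordan solver stands in for a linear algebra library, and the 3 x 2 test problem is hypothetical.

```python
def transpose(A):
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def solve(A, B):
    """Solve A X = B (B given as a list of rows) by Gauss-Jordan elimination."""
    n = len(A)
    aug = [list(A[r]) + list(B[r]) for r in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))  # partial pivoting
        aug[c], aug[p] = aug[p], aug[c]
        pivot = aug[c][c]
        aug[c] = [x / pivot for x in aug[c]]
        for r in range(n):
            if r != c:
                f = aug[r][c]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]

def bayesian_linear_estimate(M, C, C_a, dq):
    """Equation (10): (M' C^-1 M + C_a^-1)^-1 M' C^-1 dq, with dq a list."""
    k = len(C_a)
    Mt = transpose(M)
    identity = [[float(i == j) for j in range(k)] for i in range(k)]
    Ca_inv = solve(C_a, identity)
    normal = matmul(Mt, solve(C, M))                      # M' C^-1 M
    normal = [[normal[i][j] + Ca_inv[i][j] for j in range(k)] for i in range(k)]
    rhs = matmul(Mt, solve(C, [[q] for q in dq]))         # M' C^-1 dq
    return [row[0] for row in solve(normal, rhs)]

# Hypothetical problem: three data values, two parameters, unit error
# covariance and a very weak prior, so the result approaches least squares.
M = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
C = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
C_a = [[1.0e12, 0.0], [0.0, 1.0e12]]
print(bayesian_linear_estimate(M, C, C_a, [1.0, 2.0, 3.0]))
```

The prior term acts as a regularization: a small a priori covariance pulls the correction toward zero, while a large one leaves the data to determine it.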

BC followed this formalism, assuming for simplicity that the observational noise only
affected the model matrix $M$, and the modelization error only the heat flux correction $\delta q$.
On the basis of order of magnitude estimates, they used a constant rms error of 35 W/m²
(10 W/m²) with a simple exponential decay for the total (modelization) errors. There are a
number of simplifications in this approach. As shown by (6), both $\delta q$ and $M$ depend on
the input data (e.g., the surface wind stress affects both the heat exchanges and the ocean
dynamics), hence they are both affected by data uncertainties and modelization errors. The
errors in $\delta q$ and $M$ are thus not statistically independent, and the model matrix really is a
stochastic  regression  matrix.  Unfortunately,  ordinary  and  generalized  least  squares 
estimators  are  in  general  not  consistent  in  this  case  of  nonlinear  coupling  between  model 
and  data  errors.  The  error  models  used  by  BC  are  also  highly  idealized.  Since  the  results 
of  the  tuning  are  sensitive  to  the  assumed  error  models,  we  adopt  a  more  elaborate 
strategy  to  achieve  a  refined  estimate. 

b.  The  adaptive  procedure 

The  correlation  scales  of  the  model  response  errors  due  to  forcing  and  SST  uncertainties 
are  large  and  complex,  hence  difficult  to  represent  a  priori.  However,  they  can  be 
estimated  by  using  the  different  wind  stress  products  and  the  long  SST  time  series  of 
section  2b,  since  many  plausible  realizations  of  the  model  seasonal  response  are  available. 
We  thus  perform  the  optimization  on  the  mean  seasonal  cycle,  which  is  least  noisy,  and 
use  the  dispersion  of  the  model  seasonal  responses  as  independent  information  to 
construct  a  more  realistic  model  for  the  observational  errors. 

Assuming that the parameters do not vary in time, we can write for each year $t$ (here $t = 1, \ldots, 22$)
and for each forcing $i$ (here $i = 1, \ldots, 5$), denoted by the upper index, that the linear model
(7) holds:

$$L_t^i(T_{o,t}) + M_t^i(T_{o,t})\,a_p + \delta q_t^i = 0. \qquad (11)$$
Denoting long-term sample means by an overbar and the mean over the different forcing
by an angle brace, we write relation (6) under the form

$$\langle \overline{L}(T_o) \rangle + \langle \overline{M}(T_o) \rangle\,a_p + \langle \overline{\delta q} \rangle = 0. \qquad (12)$$

The errors in (11) and (12) are due to forcing and SST uncertainties, and to model
inadequacies.

Let us write the parameter estimation as the linear statistical model

$$\langle \overline{\delta q} \rangle = \langle \overline{M} \rangle\,\delta a + \langle \overline{e} \rangle \qquad (13)$$

where $\langle \overline{e} \rangle$ represents the errors, which are assumed to be Gaussian, with zero mean and
unknown true covariance matrix $C$. Because of the statistical dependence between $\langle \overline{\delta q} \rangle$
and $\langle \overline{M} \rangle$, an estimate of $\delta a$ is required before one may estimate the random errors from
the sample. Thus, an adaptive approach is used, where the estimates of the observational
error covariance and the model parameters are updated as part of an iterative procedure. If
we have a first estimate of $\delta a$, say $\delta a_0$, which we will take equal to zero, then we can
estimate for each year $t$ the mean error over the different forcing, $\langle e_t \rangle$, by

$$\langle e_t \rangle = \langle \delta q_t \rangle - \langle M_t \rangle\,\delta a_0. \qquad (14)$$

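A first sample estimate of the error covariance matrix associated with the random wind,
cloud and SST errors is

$$S_{e1} = \frac{1}{22 \times 21} \sum_{t=1}^{22} \left(\langle e_t \rangle - \langle \overline{e} \rangle\right)\left(\langle e_t \rangle - \langle \overline{e} \rangle\right)' \qquad (15)$$

where we have assumed for simplicity that observations are independent at yearly
intervals. We can also estimate for each forcing $i$ the long-term mean error, $\overline{e}^i$, by

$$\overline{e}^i = \overline{\delta q}^i - \overline{M}^i\,\delta a_0$$

and a first sample estimate of the error covariance matrix associated with the drag
coefficient uncertainties is

$$S_{f1} = \frac{1}{4 \times 5} \sum_{i=1}^{5} \left(\overline{e}^i - \langle \overline{e} \rangle\right)\left(\overline{e}^i - \langle \overline{e} \rangle\right)' \qquad (16)$$

A first sample estimate of the error covariance associated with the observational
uncertainties, say $S_{d1}$, can then be obtained by

$$S_{d1} = S_{e1} + S_{f1} \qquad (17)$$

and it can be used to compute an estimated generalized least squares estimate of $\delta a$, say
$\delta a_1$. As in (10), we incorporate the modelization errors and our a priori knowledge on the
model parameters:

$$\delta a_1 = (M'\,S_1^{-1}\,M + C_a^{-1})^{-1}\,M'\,S_1^{-1}\,\delta q, \qquad (18)$$

with

$$S_1 = S_{d1} + C_T. \qquad (19)$$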

The procedure is repeated by using $\delta a_1$ in (14) to get an improved estimate $S_2$, leading to
the parameter correction $\delta a_2$, and so on. If $\delta a_0$ represents a reasonable first guess and if
the inverses in (18) are well-conditioned, the procedure should converge rapidly. The end
result is a data error structure consistent with the results of the multi-year model run, and
thus presumably a better parameter estimation.
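The iteration described above can be sketched as follows. This is an illustrative reading of the procedure, not the authors' code. In particular, the normalization of the sample covariances (the covariance of a mean of n independent samples, with divisor n(n - 1)), the data layout, the fixed number of iterations, and the synthetic test problem are assumptions made here; a real application would work in the truncated space discussed in the text.

```python
import random

def transpose(A):
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def solve(A, B):
    """Solve A X = B (B given as a list of rows) by Gauss-Jordan elimination."""
    n = len(A)
    aug = [list(A[r]) + list(B[r]) for r in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        pivot = aug[c][c]
        aug[c] = [x / pivot for x in aug[c]]
        for r in range(n):
            if r != c:
                f = aug[r][c]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]

def mean(vectors):
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def cov_of_mean(vectors):
    """Assumed estimator: sum of outer products of deviations / (n (n - 1))."""
    n, m = len(vectors), mean(vectors)
    dev = [[x - mx for x, mx in zip(v, m)] for v in vectors]
    return [[sum(d[a] * d[b] for d in dev) / (n * (n - 1))
             for b in range(len(m))] for a in range(len(m))]

def residual(dq, M, da):
    return [q - sum(mj * aj for mj, aj in zip(row, da)) for q, row in zip(dq, M)]

def adaptive_tuning(dq, M, C_T, C_a, n_iter=5):
    """dq[i][t]: data vector, M[i][t]: p x k matrix for forcing i and year t."""
    n_forc, n_years, p, k = len(dq), len(dq[0]), len(C_T), len(C_a)
    mats = [M[i][t] for i in range(n_forc) for t in range(n_years)]
    M_mean = [[sum(m[r][c] for m in mats) / len(mats) for c in range(k)]
              for r in range(p)]
    dq_mean = mean([mean(dq[i]) for i in range(n_forc)])
    Ca_inv = solve(C_a, [[float(a == b) for b in range(k)] for a in range(k)])
    Mt = transpose(M_mean)
    da = [0.0] * k  # first guess: zero parameter correction
    for _ in range(n_iter):
        e = [[residual(dq[i][t], M[i][t], da) for t in range(n_years)]
             for i in range(n_forc)]
        # Yearly errors averaged over the forcings, as in (14)-(15).
        S_e = cov_of_mean([mean([e[i][t] for i in range(n_forc)])
                           for t in range(n_years)])
        # Long-term mean error of each forcing, as in (16).
        S_f = cov_of_mean([mean(e[i]) for i in range(n_forc)])
        S = [[S_e[a][b] + S_f[a][b] + C_T[a][b] for b in range(p)] for a in range(p)]
        normal = matmul(Mt, solve(S, M_mean))
        normal = [[normal[a][b] + Ca_inv[a][b] for b in range(k)] for a in range(k)]
        rhs = matmul(Mt, solve(S, [[q] for q in dq_mean]))
        da = [row[0] for row in solve(normal, rhs)]  # as in (18)
    return da

# Synthetic example (hypothetical): 5 forcings, 22 years, 3 data, 2 parameters.
rng = random.Random(0)
M0 = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
truth = [1.0, -0.5]
M_all = [[M0 for _ in range(22)] for _ in range(5)]
dq_all = [[[sum(m * a for m, a in zip(row, truth)) + rng.gauss(0.0, 0.1)
            for row in M0] for _ in range(22)] for _ in range(5)]
C_T = [[0.01 if a == b else 0.0 for b in range(3)] for a in range(3)]
C_a = [[1.0e6 if a == b else 0.0 for b in range(2)] for a in range(2)]
print(adaptive_tuning(dq_all, M_all, C_T, C_a))
```

In this synthetic setup the model matrix is the same for all years and forcings, so the residual covariances depend only weakly on the current parameter estimate; with forcing-dependent matrices, as in the paper, the error model changes from one iteration to the next.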

The error model $S_\infty$ represents most of the nonsystematic data and model errors; it also
includes such data errors as artificial trends in wind and SST data. The true interannual
variability is not treated as an error since it appears in both $\delta q_t$ and $M_t$ in (14). The
weighting in the least squares fit is therefore based on data noise and uncertainties and it
takes into account, at least approximately, the lack of independence between $M$ and $\delta q$.
On the other hand, the weighting is not affected by the systematic errors that recur every
year; model deficiencies, or systematic data biases, must be dealt with explicitly.

Because of the limited sample, the error covariance matrix is of strongly reduced rank
and the inverse of $S_\infty$ is dominated by unreliable information. Hence, the problem is
ill-conditioned. To circumvent the difficulty, we strongly reduce the dimension of the fields
and tune the model in the highly truncated space. The iterative method is implemented in
reduced space: for each forcing, each individual year is projected onto the reduced base,
thereby defining a reduced heat flux correction and a reduced model matrix. By
projection, a reduced modelization error matrix is also constructed. The sample error
covariance matrix associated with the observational uncertainties and the optimal
parameter corrections are then directly calculated in reduced space, so that the
computational costs are very limited.

c.  Model  testing 

The correctness of the SST model and the main assumptions in the inverse calculation
(e.g., modelization and data errors) can be checked by looking at the residuals after
optimization, but this ignores useful information on correlation scales. To take the
multidimensional aspects of the fields into account, we generalize a multivariate test
derived by Tarantola (1987) and consider the minimum of the misfit function (9), given by

$$2\,S(\delta a_\infty) = \delta q'\,(M\,C_a\,M' + S_\infty)^{-1}\,\delta q \qquad (20)$$
with $S_\infty = S_{d\infty} + C_T$. The null hypothesis that the only errors besides the observational ones
are the modelization errors can be tested since the test statistic (20) is distributed as
Hotelling's $T^2$ with degrees of freedom $\eta$ (the reduced dimension) and z (the equivalent
degrees of freedom of $S_\infty$). If (20) exceeds the critical value at a given level of confidence,
then some of the assumptions are unlikely to be acceptable. Since, except for possible
biases, the observational uncertainties are represented by an error model which is, by
construction, consistent with the available observations, the most likely interpretation is
that the model is not as accurate as it has been assumed, i.e., the modelization errors have
been underestimated.

4.  TUNING  THE  TROPICAL  ATLANTIC  SST  MODEL 

The monthly values of the heat flux correction and of the model matrix are first spatially
smoothed with a 5° x 5° running average. The fit is then done in the region between 10°S
and 20°N, by considering January, April, July, and October, which are representative of the
various SST regimes. The data dimension p is 322 x 4 = 1288.

[Figure 2a panels: Corrective flux; Upwelling flux; Horizontal mixing]

Figure 2a. (left) Mean heat flux correction in W m⁻² during January, April, July and October for the period
1965-1986, when using the a priori values of the model parameters. Corresponding values of (center) the
upwelling flux and (right) horizontal diffusion.

The mean heat flux correction < δq > is represented in Figure 2a. The rms value is large
(69 W m⁻²), and negative values in excess of -100 W m⁻² are found off Africa and in the
Gulf of Guinea, mostly where the largest SST differences are observed. The tuning can be
viewed as determining the best fit of the heat flux correction vector in Figure 2a by the
seven column vectors of < M(Tj) >, which are represented in Figures 2a,b (units are
arbitrary). The upwelling pattern (Fig. 2a, center) has a large signal in the Gulf of Guinea
with maximum amplitude during the upwelling season in July; a smaller signal is seen in
the ITCZ with maximum amplitude off Africa, except in April. The meridional scale of
the diffusion pattern (Fig. 2a, right) is slightly smaller than that of upwelling. The cloud
pattern (Fig. 2b, left) has broader scales and its seasonal changes reflect those of Q0 and C.
The evaporation pattern (Fig. 2b, right) has a large meridional scale and strong zonal
gradients. Additional patterns are the insolation pattern in Figure 2b (center), a constant,
and the observed SST pattern shown in Figure 1.

[Figure 2b panels: Shortwave cloud dependence; Solar angle dependence; Latent heating]

Figure 2b. (left) Cloud factor values during January, April, July and October for the period 1965-1986,
when using the a priori values of the model parameters, (center) solar angle, and (right) latent heat flux.

The data compression is done by working in the space defined by orthonormalizing the
eight vectors consisting of < δq > and the seven column vectors of < M >. As the
dimension of the subspace is the number of adjustable parameters plus one, the inverse
problem remains formally overdetermined. As described in section 3, the heat flux
correction and the model matrix are projected onto the reduced base for each year, and the
sample error covariance matrix is directly estimated in reduced space at each iteration n.
Because this sample matrix has limited degrees of freedom, its elements are inaccurately
known (large sampling errors) and its condition number is very large. Lacking precise
information on the modelization errors, we use EC's model, but double the rms error to
20 W m⁻². This modelization error matrix is not sufficient to ensure good conditioning, so a
singular value decomposition is used to invert the error covariance matrix in (18). In
practice, we apply a taper which is an estimate of the accuracy of the elements of the
sample covariance matrix.
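The data compression step described above can be sketched in a few lines. This is only an illustrative sketch, not the authors' code: the pattern vectors, the grid size, and the helper names (`orthonormalize`, `project`, `reconstruct`) are hypothetical, and a modified Gram-Schmidt procedure is assumed as one possible way of orthonormalizing the vectors.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def orthonormalize(vectors, tol=1e-12):
    """Modified Gram-Schmidt; vectors that are numerically dependent are dropped."""
    basis = []
    for v in vectors:
        w = list(v)
        for e in basis:
            c = dot(w, e)
            w = [wi - c * ei for wi, ei in zip(w, e)]
        norm = math.sqrt(dot(w, w))
        if norm > tol:
            basis.append([wi / norm for wi in w])
    return basis

def project(field, basis):
    """Coordinates of a full-space field in the reduced base."""
    return [dot(field, e) for e in basis]

def reconstruct(coords, basis):
    """Map reduced coordinates back to the full space."""
    n = len(basis[0])
    return [sum(c * e[i] for c, e in zip(coords, basis)) for i in range(n)]

# Hypothetical toy data: 6 grid values and 3 pattern vectors
# (the paper uses 1288 values and 8 vectors).
patterns = [
    [1.0, 2.0, 0.0, 1.0, 0.0, 3.0],
    [0.0, 1.0, 1.0, 0.0, 2.0, 1.0],
    [1.0, 0.0, 1.0, 1.0, 1.0, 0.0],
]
basis = orthonormalize(patterns)

# A field lying in the span of the patterns is recovered exactly from its
# reduced coordinates; anything outside the span is discarded by projection.
yearly_field = [2.0, 3.0, 2.0, 2.0, 3.0, 4.0]
coords = project(yearly_field, basis)
recovered = reconstruct(coords, basis)
```

Each yearly field is then represented by as many numbers as there are basis vectors, which is what keeps the sample covariance estimation inexpensive.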

For simplicity, we use zero for the parameter correction δa0, but the results are similar
when using a different initial value. Convergence is reached in two or three iterations, with
the largest changes occurring after the first iteration. Figure 3 shows the a priori and a
posteriori values of the adjustable parameters with twice their standard deviation (an
approximation to the 95% confidence interval). Of the seven adjustable parameters, two
strongly decrease to values that are positive, but not significantly different from zero at the
5% level: the upwelling efficiency γ and the horizontal diffusion κ. Both parameters are
well-resolved by the data set and independently resolved. However, such a small value for
the upwelling efficiency is unlikely from a physical point of view. Although the changes in
the cloud factor and the latent heat flux are also well-resolved, they are not
statistically significant at the 5% level, which suggests that the a priori choices were good,
needing only little adjustment. However, the two parameters are not independently
resolved and are anticorrelated, and correlated with the three remaining parameters,
which are poorly resolved by the data set.

Figure 4 shows the heat flux correction (8) after tuning. The amplitudes are smaller than in
Figure 2; the rms value has dropped to 32 W m⁻² and the space-time average to -7 W m⁻²,
suggesting that the warm SST bias in Figure 1 should be mostly corrected. However, heat
flux corrections larger than 100 W m⁻² can still be seen off the North African coast during
winter and in the equatorial upwelling region during summer. These are too large to be
explainable by the data uncertainties and are associated with model deficiencies, as
discussed by BC and Scoffier et al. (1993).

To verify the consistency of the inverse calculation, we apply the test of section 3c.
Although the critical value of the test statistic (20) is difficult to establish as the total error
covariance is the sum of a sample one and an (assumed to be) true one, upper and lower
bounds can easily be found. For true covariances, the critical value, given by the χ²
distribution with 8 degrees of freedom (the dimension of the space), would be 16 at the 5%
level (lower bound). For sample covariance matrices, it would be given by Hotelling's T²
and equal to 32 (upper bound). The test statistic is 385, which largely exceeds the latter
value. This confirms that the modelization errors have been strongly underestimated. In
particular, there are large modelization biases, not only random modelization errors as
assumed.
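
The lower bound quoted above can be checked with a short calculation. The sketch below is an illustration only: it assumes the standard closed-form survival function of the χ² distribution for an even number of degrees of freedom and a simple bisection; the function names are hypothetical. The Hotelling upper bound is not reproduced because it depends on the equivalent degrees of freedom of the sample covariance, which are not given here.

```python
import math

def chi2_sf_even(x, dof):
    """P(chi-square with `dof` degrees of freedom > x), valid for even dof only."""
    if dof % 2 != 0:
        raise ValueError("this closed form needs an even number of degrees of freedom")
    half = x / 2.0
    term, total = 1.0, 1.0
    for j in range(1, dof // 2):
        term *= half / j          # (x/2)^j / j!
        total += term
    return math.exp(-half) * total

def chi2_critical(dof, alpha, lo=0.0, hi=200.0):
    """Bisection for the value that is exceeded with probability alpha."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if chi2_sf_even(mid, dof) > alpha:
            lo = mid              # tail still too heavy: move right
        else:
            hi = mid
    return 0.5 * (lo + hi)

critical = chi2_critical(dof=8, alpha=0.05)
statistic = 385.0                 # value of the test statistic quoted in the text
rejected = statistic > critical
print(round(critical, 2), rejected)
```

The critical value comes out near 15.5, which the text rounds to 16; the observed statistic exceeds it by more than an order of magnitude.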


Figure  3.  Evolution  of  the  parameter  corrections 
as  a  function  of  the  number  of  iterations.  The 
error  bars  represent  the  95  %  confidence 
intervals. 

[Figure 4 panels: Corrective flux (in W m⁻²); Simulations and Differences of sea surface temperature (in °C)]

Figure 4. (left) Mean heat flux correction in W m⁻² during January, April, July and October for the period
1965-1986, after optimization; (center) Corresponding SST predictions; (right) Differences between
simulated and observed SST.


Because the tuning minimizes the heat flux correction (more precisely a weighted form of
it), it is of interest to verify whether the SST predictions have been improved by the
parameter changes. The tuned model was thus run with the same forcing fields as before.
As expected, a more realistic SST field is obtained (Fig. 4, center), although
model-observation differences of a few degrees can still be seen in the upwelling region off
Africa during the first part of the year and in the Gulf of Guinea during the second part
(Fig. 4, right). Tables 1 and 2 suggest that the model improvements are limited to a decrease
of the warm SST bias, although it still averages to 1.5°C. The mean seasonal variability and
the observed SST anomalies are not significantly improved, so that the SST model remains
largely inconsistent with the observations; the tuning is unable to compensate for the model
shortcomings.

The method is not very sensitive to the details in the calculation. The largest parameter
corrections were obtained when working with low-passed seasonal data, because filtering
decreases the magnitude of the observational and modelization errors, thereby giving more
weight to the observations in the best-fit calculation. Unfortunately, the increased
resolution by the data set leads to vanishing upwelling efficiency, which is not acceptable.
Although a larger upwelling efficiency could be obtained by constraining γ more tightly,
this stresses the inadequacy of the upwelling representation for the tropical Atlantic.

5.  CONCLUSIONS 

We  have  developed  an  adaptive  inverse  method  to  tune  the  adjustable  parameters  of  a 
tropical  SST  model  in  a  way  that  optimally  takes  into  account  the  large  uncertainties  of 
the  atmospheric  forcing  and  the  oceanic  data,  the  expected  modelization  errors  and  our 
a  priori  knowledge  of  the  parameter  values.  This  is  achieved  by  performing  the  model 
optimization  for  the  mean  seasonal  SST  cycle  and  using  the  dispersion  of  the  model 
responses  for  each  year  and  (equally  plausible)  forcing  field  as  independent  information  to 
construct  a  sample  estimate  of  the  observational  error  covariance  matrix.  The  procedure  is 
more  refined  than  that  of  BC  in  that  the  nonlinear  nature  of  the  inverse  problem  is  taken 
into  account  and  the  large  correlation  scales  of  the  forcing  uncertainties  are  represented 
realistically.  The  method  is  general  as  long  as  the  parameters  enter  the  SST  equation 
linearly,  and  it  can  be  extended  to  the  nonlinear  case  by  using  an  iterative  approach.  Since 
the  optimization  is  performed  in  a  strongly  reduced  space,  the  computational  cost  is 
limited. However, the estimation of the observational errors requires that several
multi-year model runs be available.

The method has been applied to tuning EC's SST model of the tropical Atlantic. The
optimization reduces the warm SST bias of the model, but brings no significant
improvement in its ability to represent the seasonal or interannual SST fluctuations. A
statistical test of the correctness of the assumptions in the inverse calculation shows that
the  modelization  errors  are  much  larger  than  assumed.  The  model  flaws  are  discussed  in 
Scoffier  et  al.  (1993),  who  show  that  the  model's  inability  to  properly  represent  SST 
cooling  by  upwelling  is  linked  to  the  parameterization  of  in  (1)  and,  as  seen  in  Figure  2a 
(center),  may  result  in  SST  heating  by  upwelling  when  the  SST  is  low  and  the  thermocline 
deep,  which  is  not  realistic. 

Finally,  it  should  be  noted  that  the  adaptive  tuning  procedure  provides  an  alternative  to 
imposing  the  “correction  flux”  that  is  often  needed  to  avoid  climate  drift  when  coupling  an 
SST  model  to  an  atmospheric  model.  Indeed,  the  decrease  in  mean  SST  bias  should 
decrease  climate  drift  in  the  coupled  mode  without  introducing  the  drawbacks  of  the 
correction  flux  method,  because  the  correction  more  properly  takes  place  via  model 
parameters,  without  altering  the  SST  dynamics. 

NEW DEVELOPMENTS IN STIRRING AND CHAOS:
POSSIBLE ROLE IN OCEAN SCIENCES

R. R. McCormick School of Engineering and Applied Science, Northwestern University
Evanston,  Illinois  60208-3120 

1.  Introduction  and  Setting 

Getting  asked  to  comment  outside  one’s  area  is  both  flattering  and  healthy.  However,  the 
intersection  between  what  one  might  know  and  what  people  might  like  to  hear — especially 
when  one  cannot  accurately  gauge  the  needs  of  an  audience  technically  far-removed  from 
one’s  own — might,  in  fact,  be  remarkably  small.  In  spite  of  having  heard  much  about 
oceanography  during  the  ‘Aha  Huliko‘a  Hawaii  workshop  held  in  January  1993,  such  still 
might  be  my  predicament  in  this  particular  case.  My  role  here  is  to  present  a  view  of 
mixing  and  chaos  theory  and  indicate  what  relevance  it  might  have  in  problems  of  interest 
in  oceanography.  My  assumption  is  that  the  reader  is  at  least  vaguely  familiar  with  some 
aspects  of  dynamical  chaos. 

It  probably  has  not  escaped  anybody’s  attention  that  during  the  past  few  years  there  has 
been  considerable  interest  in  chaos.  The  theoretical  foundations  of  the  subject  are  on  firm 
footing  and  demonstrations  of  chaos  have  been  firmly  established  by  analytical, 
computational,  and  experimental  means.  So  much  has  been  the  bulk  of  the  work  generated 
that  hardly  a  month  goes  by  without  a  book  being  published  and  at  the  last  count  there  were 
at  least  half  a  dozen  journals  largely  devoted  to  the  topic.  The  collective  impact  of  the  body 
of  work  so  generated,  with  no  apparent  signs  of  slowing  down,  can  be  compared  to  the 
emergence  of  a  new  paradigm.  Regrettably,  as  in  any  emerging  area,  sometimes  to  the 
chagrin  of  its  creators,  there  is  some  degree  of  overshoot  and  less  than  guaranteed 
unbounded  enthusiasm.  Not  everything  that  claims  to  be  useful  is  likely  going  to  pass  the 
test  of  time,  but  it  is  also  doubtful  that  no  permanent  mark  will  be  left.  Undoubtedly,  the 
way  that  people  will  be  educated  will  change  (in  fact,  it  is  already  changing;  college  physics 
textbooks  now  have  sections  devoted  to  chaos).  A  non-trivial  consequence  of  this  trend  is 
that  data  that  could  have  been  discarded  a  decade  or  so  ago  as  being  unanalyzable  will  be 
scrutinized  in  the  future  in  more  detail  for  trends  and  patterns. 

The  most  intuitively  understandable  definition  of  chaos  is  magnification  of  small  errors  and 
the  impossibility  of  making  predictions  for  long  times.  This  statement — so  often 
repeated — has  produced  the  impression  that  chaotic  systems  cannot  be  predicted  at  all. 
Strictly  speaking  this  is  far  from  being  true.  What  cannot  be  predicted  is  the  detailed 
evolution of a specific initial condition. The behavior of the system at large (that of a
multitude of initial conditions) may be quite robust, and this is, in fact, what matters in
many  situations  of  practical  interest.  As  we  shall  see,  a  particularly  important  example  is 
provided  by  mixing  of  fluids. 

Claims for applications of chaos theory abound. In fact, in the context of dynamical
systems, the absence of chaos seems to be the exception rather than the rule. The
applications, however, have been largely a posteriori; that is, explaining existing (complex)
behavior and demonstrating that the complexity stems from an underlying deterministic
cause. Much less has been done on the predictive side: using theory to predict the state of
systems for long times. It is apparent that there might be a need for both types of work in
the context of oceanography: interpretation of seagoing data being in the first class,
prediction based on available information being in the second. An alternative breakdown
might divide the tasks between analysis of observational data on one side and analysis of
output from numerical models, such as general circulation models, on the other.

The objective of this article is to provide a brief overview of some of our past work on
mixing and chaotic advection including a few remarks not made before. However, in order
to accomplish this objective and set things in perspective, a number of remarks
pertaining to general aspects of chaos theory will be made. As there are a large number of
references for this material, no review is attempted. The second part of the presentation
involves issues in chaotic advection. One general reference on this topic is available
(Ottino, 1989a) and an introductory review of chaotic mixing is given in Ottino (1989b).
Several other reviews are available (Aref, 1991; the issue of Physics of Fluids A, 3,
May 1991, is entirely devoted to stirring and mixing).

2.  Dynamical  Chaos:  Brief  Review  of  Essential  Concepts 

During  the  past  few  years  there  has  been  a  realization  that  nonlinear  dynamical  systems  are 
able  to  display  a  variety  of  what  superficially  might  be  regarded  as  two  contradictory — but, 
in  fact,  perfectly  coexisting — behaviors.  On  one  hand  the  output  can  be  order  (e.g. 
solitons),  on  the  other  it  can  be  chaos  (See  Fig.  1).  Often  a  system  exhibits  both  behaviors 
simultaneously. 

Figure 1. Overview of dynamical systems and definitions of chaos.

In the context of our discussion dynamical systems are given by sets of ordinary
differential equations (ODEs) or maps

$$\frac{dx}{dt} = f(x, p), \qquad x_{n+1} = g(x_n, p), \qquad (1)$$

where x = (x_1, ..., x_k) with k ≥ 1. In order for the system to be chaotic, k ≥ 3 for ODEs and k ≥ 1 for
maps. The components of the vector x might have either a transparent physical meaning or
not according to the problem in question. For example, in chaotic advection x denotes the
actual physical space but in problems where the set of ODEs is arrived at through
truncation, as in the case of Lorenz's equations, the variables have a less transparent one.
The space spanned by x is called the phase space of the system and p, or a set of p's, are
parameters such as the Rayleigh number or Reynolds number. In the case of systems
described by partial differential equations (PDEs) the number of degrees of freedom is
infinite and therefore the phase space is infinite as well. According to the form of f(x),
more precisely the sign of ∇·f(x), we can speak of two kinds of systems. In one class
there is volume contraction in phase space (∇·f(x) < 0); these are dissipative systems. The
other kind of systems are those that conserve volume in phase space (∇·f(x) = 0), and of
those the most important sub-class is given by the so-called Hamiltonian systems [a
system can be volume preserving and not be Hamiltonian; however, if it is Hamiltonian it
is volume preserving]. The prototypical example of a dissipative system is the forced
pendulum with friction; the prototypical Hamiltonian system is a forced pendulum without
friction. The bulk of the presentation here will be restricted to volume preserving systems.
However, in order to place the topic in perspective a few remarks pertaining to dissipative
systems might be in order (for mathematical presentations of dynamical systems see
Guckenheimer and Holmes, 19 - and Wiggins, 1991; for a collection of classical papers
the reader can consult Hao, 1984; an accessible introduction to chaos in both dissipative
and non-dissipative systems is given by Doherty and Ottino, 1988).
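
The distinction between dissipative and volume preserving systems can be illustrated numerically by estimating ∇·f at a point of phase space. The following is a minimal sketch under stated assumptions: the Lorenz parameter values (10, 28, 8/3) are the commonly quoted ones and are not taken from this article, the pendulum is written without forcing (a forcing term that does not depend on the state leaves the divergence unchanged), and the function names are hypothetical.

```python
import math

def lorenz(x, y, z, sigma=10.0, r=28.0, b=8.0 / 3.0):
    """Right-hand side f(x) of the Lorenz equations (assumed standard parameters)."""
    return (sigma * (y - x), r * x - y - x * z, x * y - b * z)

def pendulum(theta, omega, friction=0.0):
    """Pendulum as a first-order system; friction = 0 gives the Hamiltonian case."""
    return (omega, -math.sin(theta) - friction * omega)

def divergence(f, point, h=1e-5):
    """Central-difference estimate of div f at a point of phase space."""
    total = 0.0
    for i in range(len(point)):
        plus, minus = list(point), list(point)
        plus[i] += h
        minus[i] -= h
        total += (f(*plus)[i] - f(*minus)[i]) / (2.0 * h)
    return total

div_lorenz = divergence(lorenz, (1.0, 2.0, 20.0))          # negative: dissipative
div_free = divergence(pendulum, (0.3, 1.2))                # zero: volume preserving
div_damped = divergence(lambda th, om: pendulum(th, om, friction=0.5), (0.3, 1.2))
```

For the Lorenz system the divergence is the constant -(sigma + 1 + b), so phase-space volumes contract everywhere; the frictionless pendulum gives zero and the damped one gives minus the friction coefficient.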

Dissipative systems are typically associated with one dimensional maps (such as the
logistic equation), volume contracting systems of ordinary differential equations (such as
the Lorenz equations), and strange attractors characterized by fractal dimensions. If the
model is continuous, a dissipative system must consist of at least three (autonomous)
ordinary differential equations in order to exhibit chaos (as in the Lorenz model). On the
other hand, if the model is represented by a mapping x_{n+1} = g(x_n, p), it can display chaos in
one dimension, i.e. with x_n being real (as in the logistic equation). By contrast, a volume
preserving mapping must be at least two-dimensional to be chaotic. As opposed to
dissipative systems, Hamiltonian systems have no stable steady states, the phase space
does not contract, and there are no attractors, strange or otherwise. Dissipative and
Hamiltonian systems have their own ways of “going chaotic.” However, both types of
systems have a few things in common. One of the connections is a stretching-and-folding
mechanism in phase space; this is what might be regarded as the basic mechanism leading
to amplification of errors in chaotic systems.
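
The amplification of errors in a one-dimensional map can be quantified with the Lyapunov exponent, the average logarithmic rate at which nearby orbits separate. The sketch below is illustrative only: it assumes the logistic map in the form x_{n+1} = r x_n (1 - x_n) with r = 4, for which the exponent is known to be ln 2; the initial condition, iteration counts, and function names are arbitrary choices.

```python
import math

def logistic(x, r=4.0):
    return r * x * (1.0 - x)

def lyapunov_exponent(x0, r=4.0, n_transient=1000, n_iter=100000):
    """Orbit average of ln|g'(x)|, with g'(x) = r (1 - 2x) for the logistic map."""
    x = x0
    for _ in range(n_transient):      # discard the initial transient
        x = logistic(x, r)
    total = 0.0
    for _ in range(n_iter):
        # the floor only guards against log(0) if the orbit lands exactly on x = 1/2
        total += math.log(max(abs(r * (1.0 - 2.0 * x)), 1e-300))
        x = logistic(x, r)
    return total / n_iter

lam = lyapunov_exponent(0.3)
print(lam, math.log(2.0))
```

A positive exponent means that a small initial error grows on average like exp(lam * n), here roughly doubling at every iteration until it saturates at the size of the interval.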

The closest applications related to oceanography are probably those in meteorology (for
reviews see Tsonis and Elsner, 1989; Yang, 1991; Zeng et al., 1993), a well-known
application being the model of the El Niño-Southern Oscillation system (Vallis, 1986).
Literature in this area appears voluminous when compared with oceanographic
applications. When averaged across all fields probably over 95% of the current
applications of chaos involve dissipative systems. In meteorology the ratio is close to
100%.


The first task when facing a complex system or signal is to determine if it is stochastic or
chaotic (Sigeti and Horsthemke, 1987). If the process is
indeed chaotic, the next task is to determine whether or not it possesses a strange attractor,
the hope being that no matter how large the original system might be, the dynamics might
be captured by the motion in a subspace of much smaller dimension (Fig. 2). These
reconstruction techniques can be based on the measurement of one or more components of
the vector x (Packard et al., 1980; Wolf et al., 1985). Subsequently, the “amount of chaos”
in the projection of the attractor can be characterized by determining its dimensions, by
measuring one or more Lyapunov exponents, and so forth (for a practical application of
these ideas, see Parker and Chua, 1989). Naturally, there are instances when the analysis
starts with the equations themselves (for example the Navier-Stokes equations in a
problem in fluid mechanics). However, in many cases the equations are unmanageable
and they have to be transformed in a way that is suitable for analysis. This is where the
issue of representing a PDE in terms of finite degrees of freedom appears. The most
famous example belonging to this class is the reduction of the Rayleigh-Bénard flow
problem to the Lorenz equations (Lorenz, 1962). A question in this case is whether the
chaos that is seen in the 3x3 truncated system would actually appear in the full problem or
not. This issue was studied by, among others, Wiin-Nielsen (1992). The answer, not
surprisingly, is that, yes, the details of the process might depend heavily on the number of
equations considered and that extreme care should be exercised in extrapolating
conclusions outside the range of applicability of the equations.

Figure 2. Typical modes of analyses of chaotic systems.

In  the  viewpoint  advocated  here  truncation  is  not  an  issue.  The  viewpoint  adopted  is  a 
purely  kinematical  one  which  is  only  suited  to  the  analysis  of  fluid  mechanical  issues. 
However,  to  the  extent  that  oceanography  is  routinely  faced  with  such  issues,  this  does  not 
seem  to  be  a  terribly  important  drawback.  The  dynamical  system  is  the  velocity  field  itself. 
An important fringe benefit of this approach is the rather transparent connection between
the  underlying  mathematics  and  their  associated  physical  meaning. 

3.  Chaotic  Advection:  Kinematics 

The  study  of  mixing  begins  with  the  analysis  of  the  motion  due  to  an  imposed  velocity 
field;  i.e.,  the  study  of  the  dynamical  system 

$$\frac{dx}{dt} = v(x, t) \qquad (2)$$

where v(x,t) is usually obtained by solution of the Navier-Stokes equations and is volume
preserving (i.e., ∇·v = 0). The solution of (2) with the initial condition that x = X at t = 0 is

$$x(t) = \Phi(X, t) \quad \text{such that} \quad X = \Phi(X, 0) \qquad (3)$$

This  solution  is  called  the  flow  or  motion.  Although  traditional,  and  probably  by  now 
unchangeable,  it  should  be  noted  that  Eqns.  (2-3)  represent  an  abuse  of  notation.  The 
variable  x  has  two  meanings  that  can  be  inferred  according  to  context.  In  the  first  one,  as  in 
the  right  hand  side  of  (2),  x  represents  a  fixed  position  in  space;  in  the  second  one,  as  in  the 
left  hand  side  of  (3),  x  represents  the  position  of  particle  X  at  time  t.  Note  also  that  it  is 
common  to  refer  to  a  specific  fluid  particle  as  “particle  X,”  when  in  fact  we  mean  the  fluid 
particle  that  was  initially  located  at  position  X.  Equations  written  in  terms  of  X  are 
referred to as Lagrangian; equations written in terms of x are referred to as Eulerian. These
two viewpoints are classical in fluid mechanics. The key idea in chaotic advection is that
whereas v(x,t) might be simple, v(X,t) can be extremely complicated.

The traditional characterization of velocity fields, usually v(x,t), is in terms of streamlines,
streaklines, and pathlines. A graph of equation (3) for a single X, with t as a parameter,
gives the pathline of particle X. The streamlines corresponding to the velocity field v(x,t)
at time t are the solutions of dx/ds = v(x,t), where s is a parameter and t is fixed. The
streakline passing through x' at time t is the locus of all particles which passed through x'
during the interval 0 to t. Physically, this corresponds to the curve traced out by a
non-diffusive dye which is injected at x'.

A  description  of  a  velocity  field  in  terms  of  streaklines  and  pathlines  represents  a  nearly 
complete characterization of the flow. However, analytical examples of streamlines,
streaklines,  and  pathlines  are  rare  unless  the  flows  happen  to  be  trivial.  The  reason  has  to 
do  with  the  fact  that  many  solutions  are  chaotic  and  therefore  cannot  possibly  be  written 
down.  In  all  the  examples  considered  here  the  velocity  fields  are  two  dimensional  and  time 
periodic.  It  should  be  pointed  out  that  steadiness  does  not  preclude  chaos.  The  velocity 
field,  however,  has  to  be  three  dimensional  for  this  to  occur. 

The most studied case of chaotic advection corresponds to time-periodic velocity fields. A
time-periodic flow can be regarded as a composition of motions or, equivalently, the
iteration of a map. A few remarks regarding the composition of motions seem in order,
because there are subtle points which are often misunderstood. When two different
motions, Φ^A and Φ^B, follow each other, they can be composed as

$$x = \Phi^{B}\left(\Phi^{A}(X, t_A),\, t_B\right) \qquad (4)$$

Here, the first motion acts for time t_A, and the second motion acts for time t_B. It is
understood that the final position of the particle after completion of the first motion
constitutes the initial position for the particle for the second motion. In general this is not
equivalent to

$$x = \Phi^{A}\left(\Phi^{B}(X, t_B),\, t_A\right) \qquad (5)$$

In this case, the first motion acts for time t_B, and the second motion acts for time t_A.

Even composing a single flow with itself can be a bit subtle. A straightforward
composition of flows, i.e., transforming X with Φ for t, and then transforming again for τ
yields, in general, incorrect results. This occurs because the velocity field is time
dependent. When the Eulerian velocity field is unsteady, it matters not only where a
particle is located, but when it is found there. By contrast, when the velocity field is time
independent, a flow may be composed with itself, and the composition is also
commutative, i.e.,

$$x(t + \tau) = \Phi[\Phi(X, t), \tau] = \Phi[\Phi(X, \tau), t] = \Phi(X, t + \tau) \qquad (6)$$

If the velocity field is time periodic, i.e., v(x,t) = v(x,t+T), then a flow may be composed
with itself, but only for an amount of time which is an integer multiple of the period of the
velocity field:

$$x(t + T) = \Phi(\Phi(X, T), t) \qquad (7)$$

$$x(t + nT) = \Phi(\Phi(\Phi(\cdots\Phi(X, T)\cdots, T), T), t) = \Phi(\Phi(X, nT), t) \qquad (8)$$
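
The composition rule for time-periodic fields can be checked numerically. The sketch below is a toy illustration, not taken from the article: it assumes a hypothetical one-dimensional velocity field with period T = 1 (the article's flows are two-dimensional) and a fourth-order Runge-Kutta integrator, with `flow(X, t)` standing for the motion started at time zero.

```python
import math

T = 1.0  # period of the toy velocity field (assumption)

def velocity(x, t):
    """Hypothetical 1-D time-periodic velocity field: v(x, t) = v(x, t + T)."""
    phase = 2.0 * math.pi * t / T
    return math.cos(phase) * x + math.sin(phase)

def flow(X, t, n_steps=2000):
    """Position at time t of the particle located at X at time 0 (RK4 integration)."""
    h = t / n_steps
    x, s = X, 0.0
    for _ in range(n_steps):
        k1 = velocity(x, s)
        k2 = velocity(x + 0.5 * h * k1, s + 0.5 * h)
        k3 = velocity(x + 0.5 * h * k2, s + 0.5 * h)
        k4 = velocity(x + h * k3, s + h)
        x += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        s += h
    return x

X, t = 0.7, 0.25
direct = flow(X, t + T)               # integrate straight through to t + T
period_first = flow(flow(X, T), t)    # advance a whole period, then t: valid
period_last = flow(flow(X, t), T)     # advance t, then restart the clock: not valid

print(direct, period_first, period_last)
```

Advancing by a whole period first returns the velocity field to its initial phase, so the second application of the flow map is legitimate; restarting the clock after a fraction of a period is not, and the third value differs from the other two.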

Flows due to a time periodic velocity field are frequently written as a mapping:

$$x_{n+1} = M(x_n) \qquad (9)$$

Customarily, the parentheses around x_n are omitted. In mapping notation, usually the initial
particle position is denoted as x_0, rather than X. Equation (9) gives the position of a
particle at the end of the (n+1)th period, given its position at the end of the nth period. Since
mappings are derived from periodic velocity fields, a mapping may be composed with itself:

$$x_{n+2} = M M x_n = M^{2} x_n \qquad (10)$$

$$x_{n+m} = M^{m} x_n \qquad (11)$$

Of course, two different mappings may be composed together; i.e., if x_1 = M x_0 and x_2 =
N x_1, then x_2 = N M x_0. In some respects, a mapping does not contain quite as much
information as the corresponding motion. However, it does possess most of the important
qualitative characteristics. It might be argued that these considerations apply to too simple
cases. However, a complete understanding of time-periodic flows seems necessary before
venturing into general unsteady flows.

4. Stretching and Regular Flows

Stretching lies at the heart of mixing. Stretching governs the fine scale of passive scalars
dispersed in the flow and acts as a fabric for the evolution of diffusing scalars in the flow.
To quantify the amount of stretching which occurs around a particle we follow a small
material vector δx attached to the particle. The length stretch, λ, is simply the ratio of the
length at time t to the initial length:

$$\lambda = \frac{|\delta x(t)|}{|\delta x(0)|} \qquad (12)$$

The orientation vector, denoted m, is simply δx normalized to unit length:

$$m = \frac{\delta x}{|\delta x|} \qquad (13)$$

The time evolution of the length stretch can be written as

$$\frac{\dot{\lambda}}{\lambda} = (\nabla v)^{T} : mm = D : mm \qquad (14)$$

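where $\mathbf{D}$ is the symmetric part of the velocity gradient tensor, $\nabla\mathbf{v}$. By the Cauchy-Schwarz
inequality, $\dot{\lambda}/\lambda$ is bounded by $(\mathbf{D}:\mathbf{D})^{1/2}$ (since the magnitude of the dyad $\mathbf{m}\mathbf{m}$ is
equal to one). The normalized stretching rate is called the stretching efficiency:

$e = \mathbf{D}:\mathbf{m}\mathbf{m} \, / \, (\mathbf{D}:\mathbf{D})^{1/2}$  (15)

In an n-dimensional flow, the efficiency can attain a maximum value of $(1 - 1/n)^{1/2}$.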

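In many flow systems, the instantaneous values of both the specific stretch rate and the efficiency
vary erratically in time. More useful quantities are the time averaged values, $\langle \mathbf{D}:\mathbf{m}\mathbf{m} \rangle$ and $\langle e \rangle$:

$\langle \mathbf{D}:\mathbf{m}\mathbf{m} \rangle = \frac{1}{t}\int_0^t \mathbf{D}:\mathbf{m}\mathbf{m}\, dt' = \frac{1}{t}\int_0^t \frac{d(\ln\lambda)}{dt'}\, dt' = \frac{\ln\lambda}{t}$  (16)

$\langle e \rangle = \frac{1}{t}\int_0^t \frac{\mathbf{D}:\mathbf{m}\mathbf{m}}{(\mathbf{D}:\mathbf{D})^{1/2}}\, dt'$  (17)

A system is considered efficient for mixing if the long time value (i.e., as $t \to \infty$) of $\langle e \rangle$
(or equivalently $\langle \mathbf{D}:\mathbf{m}\mathbf{m} \rangle$) tends to a positive value, regardless of the initial orientation of the
material filament $\delta\mathbf{x}$.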

A complicated stretching function, with a nearly constant time average, is a symptom of
“chaotic advection.” Steady two-dimensional flows with $\nabla \cdot \mathbf{v} = 0$ cannot produce chaotic
advection; stretching is linear in time, the stretching function decays as $1/t$, and the
efficiency decays to zero. This can be seen in various ways. A steady area preserving
two-dimensional flow is characterized by the streamfunction $\psi(x,y)$. Level curves
$\psi(x,y,t = \text{fixed})$ give the instantaneous picture of the streamlines, which in this case
coincide with the pathlines and streaklines. If the flow is bounded, the flow can be
divided into regions of closed streamlines and the stretching within each region is poor. In
fact, if we let $T(\psi)$ denote the period on the streamline $\psi$, it is then possible to show that
$d\mathbf{x}(t)$ is mapped into $d\mathbf{x}(t+T)$ at time $t+T$:

$d\mathbf{x}(t+T) = d\mathbf{x}(t) \cdot [\mathbf{1} - (dT/d\psi)(\nabla\psi)\mathbf{v}] +$ higher order terms in $d\mathbf{x}$  (18)

and that the orientation of the filament after n cycles of the flow is given by

$\mathbf{m}_n = \mathbf{m}_0 \cdot [\mathbf{1} - n\,(dT/d\psi)(\nabla\psi)\mathbf{v}] \, / \, \lambda$,  (19)

where $\mathbf{m}_0$ is the initial orientation. As the number of cycles goes to infinity, the filament
becomes aligned with the streamlines and the stretching $\lambda$ becomes linear with time
(Franjione and Ottino, 1991).
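
The linear growth of stretching and the decay of the efficiency in a regular flow can be checked on a very simple case. The Python sketch below is an illustration only: it assumes a steady simple shear v = (gamma*y, 0), which is not one of the flows treated in the text, and the function name `simple_shear_stretch` is hypothetical. It evaluates the length stretch, the specific stretching rate D:mm of equation (14), and the stretching efficiency for a single filament.

```python
import numpy as np


def simple_shear_stretch(gamma, t, m0):
    """Stretch, specific stretching rate, and efficiency of a material filament
    in the steady simple shear flow v = (gamma * y, 0) (an assumed test flow).

    m0 is the initial orientation of the filament; it is normalized here.
    """
    m0 = np.asarray(m0, dtype=float)
    m0 = m0 / np.linalg.norm(m0)
    # Deformation gradient of simple shear after time t.
    F = np.array([[1.0, gamma * t], [0.0, 1.0]])
    dx = F @ m0
    lam = np.linalg.norm(dx)      # length stretch (current / initial length)
    m = dx / lam                  # current orientation of the filament
    # Symmetric part of the velocity gradient for this flow.
    D = 0.5 * gamma * np.array([[0.0, 1.0], [1.0, 0.0]])
    rate = m @ D @ m              # D:mm, the specific stretching rate
    efficiency = rate / np.sqrt(np.sum(D * D))   # normalized by (D:D)^(1/2)
    return lam, rate, efficiency


if __name__ == "__main__":
    for t in (0.5, 1.0, 10.0, 100.0):
        lam, rate, eff = simple_shear_stretch(1.0, t, [0.0, 1.0])
        print(f"t={t:7.1f}  stretch={lam:9.3f}  rate={rate:.4f}  efficiency={eff:.4f}")
```

For a filament that starts perpendicular to the streamlines, the efficiency in this assumed flow peaks when gamma*t = 1 and then decays while the stretch grows only linearly, which is the behavior described above for steady two-dimensional flows.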

5.  Chaos  in  Area-Preserving  Flows 

The best understood case of chaotic advection corresponds to area-preserving flows. The
understanding of this case resides in knowing something about the periodic points of the
flow and their associated manifolds. Let us review briefly some of the main concepts.
Given a flow $\mathbf{x} = \Phi(\mathbf{X},t)$, P is a fixed point of the flow if

$P = \Phi(P,t)$  (20)

for all time t (i.e., the particle located at the position P stays at P). On the other hand, the
point P is periodic, of period T, if

$P = \Phi(P,nT)$  (21)

for n = 1, 2, 3,... but not for any t < T. That is, the material particle that happened to be at the
position P at time t = 0 will be located in exactly the same spatial position after a time nT [it
could be anywhere for nT < t < (n+1)T]. Similar definitions apply to period-p points (for
example, a period-2 point returns to P for n = 2, 4, 6,...). It is important to stress that the
concept of periodicity depends on the frame of reference. Thus, for example, there are
periodic points in a moving frame in the cat-eyes portrait in a shear flow, but there are none
in a fixed frame (see Ottino, 1989a; Shariff et al., 1991). Periodic points can be classified
as hyperbolic, elliptic, or parabolic, according to the deformation of the fluid in the
neighborhood of the periodic point (the parabolic case being degenerate). The character of
the flow in the neighborhood of the periodic point is given by the eigenvalues of the
linearized mapping:


(22) 

where  JD  denotes  the  operation  According  to  the  value  of  the  eigenvalues  , 

the point P is called hyperbolic, elliptic, or parabolic:

Hyperbolic  $|\lambda_1| > 1 > |\lambda_2|$, $\lambda_1\lambda_2 = 1$,  (23a)

Elliptic  $|\lambda_k| = 1$ (k = 1, 2) but $\lambda_k \neq 1$,  (23b)

Parabolic  $\lambda_k = \pm 1$ (k = 1, 2).  (23c)


The  net  motion  in  the  neighborhood  of  an  elliptic  periodic  point  is  rotation;  the  motion  in 
the  neighborhood  of  hyperbolic  point  is  contraction  in  one  direction  and  stretching  in 
another. 
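
The classification in (23a)-(23c) can be sketched in a few lines of Python. This is an illustration under stated assumptions: the caller supplies the 2x2 matrix of the linearized mapping at the periodic point, the map is area preserving (determinant equal to one), and the function name `classify_periodic_point` is hypothetical. Because the two eigenvalues then multiply to one, the trace alone decides the type.

```python
import numpy as np


def classify_periodic_point(jacobian, tol=1e-9):
    """Classify a periodic point of an area-preserving 2-D map from the
    2x2 matrix of the linearized mapping evaluated at that point.

    With determinant 1 the eigenvalues solve s**2 - tr*s + 1 = 0, so:
      |tr| > 2 : real reciprocal pair, one of magnitude > 1  -> hyperbolic
      |tr| < 2 : complex pair on the unit circle, not +-1     -> elliptic
      |tr| = 2 : repeated eigenvalue +1 or -1                 -> parabolic
    """
    J = np.asarray(jacobian, dtype=float)
    if J.shape != (2, 2):
        raise ValueError("expected a 2x2 matrix")
    if abs(np.linalg.det(J) - 1.0) > 1e-6:
        raise ValueError("matrix is not area preserving (det != 1)")
    tr = np.trace(J)
    if abs(tr) > 2.0 + tol:
        return "hyperbolic"
    if abs(tr) < 2.0 - tol:
        return "elliptic"
    return "parabolic"


if __name__ == "__main__":
    angle = 0.3
    rotation = [[np.cos(angle), -np.sin(angle)], [np.sin(angle), np.cos(angle)]]
    print(classify_periodic_point([[2.0, 1.0], [1.0, 1.0]]))  # stretching and contraction
    print(classify_periodic_point(rotation))                   # net rotation
    print(classify_periodic_point([[1.0, 1.0], [0.0, 1.0]]))  # pure shear, degenerate
```

The tolerance `tol` is a practical choice for floating-point input; the parabolic case is degenerate, so numerically computed Jacobians will rarely land on it exactly.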

Hyperbolic points have associated invariant regions of inflow and outflow called the stable
[$W^s(P)$] and unstable [$W^u(P)$] manifolds:

$W^s(P) \equiv \{\text{all } \mathbf{X} \in R^2 \text{ s.t. } \Phi_t(\mathbf{X}) \to P \text{ as } t \to \infty\}$  (24a)

$W^u(P) \equiv \{\text{all } \mathbf{X} \in R^2 \text{ s.t. } \Phi_t(\mathbf{X}) \to P \text{ as } t \to -\infty\}$  (24b)

Fluid particles leave the neighborhood of P through $W^u(P)$ and get back to P via $W^s(P)$.
Physically, the unstable manifold corresponds to a streakline injected at the periodic point.
By definition, the sets $W^s(P)$ and $W^u(P)$ are invariant; a particle belonging to one of the sets
does so permanently and cannot escape from it. In bounded steady flows, the outflow
$W^u(P)$ joins smoothly into the inflow $W^s(P)$; in this case nothing interesting happens.

In time-periodic flows the manifolds might intersect non-tangentially. A point belonging
simultaneously to both the stable and unstable manifolds of two different fixed (or
periodic) points P and Q is called a transverse heteroclinic point. If P = Q the point is
called homoclinic; if P ≠ Q the point is called heteroclinic. One intersection implies
infinitely many and sensitivity to initial conditions. The sensitivity to initial conditions, or
exponential divergence of initial conditions, is measured by means of Lyapunov exponents.
The Lyapunov exponent is the long-time average of the specific rate of stretching,
$D \ln\lambda / Dt$:

$\sigma = \lim_{t \to \infty} \frac{1}{t} \ln\lambda$  (25)

Thus, the average stretching efficiency can be interpreted as a normalized Lyapunov
exponent [with respect to $(\mathbf{D}:\mathbf{D})^{1/2}$].

An important, and often misunderstood, distinction should be made between fixed points
of velocity fields and maps. Given a flow $\mathbf{x} = \Phi(\mathbf{X},t)$, P is a fixed point of the flow if

$P = \Phi(P,t)$  (26)

for all time t (i.e., the particle located at the position P stays at P); equivalently $\mathbf{v}(P,t) \equiv 0$ for
all t. A critical point, on the other hand, corresponds to locations such that $\mathbf{v}(P,t) = 0$ at
some time t. Fixed and critical points corresponding to isochoric two-dimensional flows
can be hyperbolic or saddle type, elliptic, or parabolic; the character of the fixed point can
be obtained by linearizing the velocity field (as opposed to the motion) near P. There is a
key difference between periodic points and critical points. A periodic point is a material
point; a critical point is not. Thus, if one were able to place a labeled fluid particle at any
arbitrary  time  on  a  periodic  point  the  particle  will  faithfully  record  the  motion  of  the 
periodic  point  for  all  times.  Such  a  thought  experiment  is  not  possible  with  a  critical  point. 
A  critical  point  might  appear  or  disappear  according  to  when  the  flow  is  looked  at;  a 
periodic  point  cannot  possibly  disappear.  This  is  a  point  that  often  escapes  people 
interested  in  visualizing  flows.  An  estimation  of  the  mixing  abilities  of  flows  based  on 
streamline portraits can be misleading. This has been pointed out in the past (Hama,
1962), but is worth repeating, primarily when viewed in the context of what happens in
two-dimensional time periodic chaotic flows.

A final comment should be made about periodic points. It often happens that the simple
prototypical chaotic systems studied in the context of chaotic advection present symmetry
properties. Mathematically, two maps A and B are said to be symmetric to each other if
there exists a transformation S such that

$B = S A S^{-1}$  (27)

If A = B, the symmetry is termed ordinary; if $A^{-1} = B$, the symmetry is termed
time-reversal. In general, S can be a rotational symmetry or reflectional symmetry. An
important consequence of this is that if a map possesses symmetry, the periodic points are
found in symmetric arrangements.




OTTINO 


6.  Statistical  Tools 

The bulk of the systems studied, to date, in chaotic advection are deterministic, exceptions
being attempts to introduce molecular diffusion into the description given by equation (2)
or random forcing instead of periodic forcing. However, to the extent that outcomes are
chaotic, statistical tools provide useful guidance in the analysis of various systems. A
particularly useful tool is single-parameter scaling, which fits within the broader
context of multiplicative processes (Redner, 1990). A simple explanation of the main facts
can be put forth in terms of stretching.

Consider a large number of points, each with an associated vector $\delta\mathbf{x}$, advected by a
time periodic flow. Let $dN(\lambda)$ be the number of points with stretching between $\lambda$ and
$\lambda + d\lambda$. The probability density of a point having a stretching $\lambda$ after n periods is
$F_n(\lambda) = dN(\lambda)/d\lambda$; similarly $H_n(\log\lambda) = dN(\log\lambda)/d(\log\lambda)$; the distributions $F_n(\lambda)$
and $H_n(\log\lambda)$ are related by $H_n(\log\lambda) = \lambda F_n(\lambda)$. Such distributions may be analyzed
by single parameter scaling.

The main idea is the following. A distribution, $G_n(x)$, is said to have single-parameter
self-similarity if under a transformation of variables

$x \to y = x / X(n)$,  (28)

$G_n(x) \to g(y) = K(n)\, G_n(x)$,  (29)

the function $g(y)$ becomes (asymptotically) independent of n; $X(n)$ can be obtained as the
ratio of two successive convergent moments, $X(n) = m_{i+1}(n)/m_i(n)$, where $m_i(n)$ is given by

$m_i(n) = \int x^i\, G_n(x)\, dx$,  (30)

whereas $K(n)$ is given by

$K(n) = C_i\, X(n)^{i+1} / m_i(n)$  (31)

where $C_i$ is a constant. It is apparent that this technique allows for the computation of the
evolution of the moments of the distribution (Muzzio et al., 1991).
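
The rescaling step can be tried on sample data. The Python sketch below is illustrative and rests on assumptions not made in the text: the distribution is represented by a set of samples, the moments are estimated as sample averages (which presumes a normalized distribution), and the function names `scaling_factor` and `rescale` are hypothetical. It computes the scaling variable as the ratio of two successive moments and checks that two data sets differing only by a scale factor collapse onto the same rescaled variable.

```python
import numpy as np


def scaling_factor(samples, i=1):
    """Estimate X(n) as the ratio of successive sample moments m_{i+1} / m_i."""
    x = np.asarray(samples, dtype=float)
    return np.mean(x ** (i + 1)) / np.mean(x ** i)


def rescale(samples, i=1):
    """Return y = x / X(n) for every sample."""
    x = np.asarray(samples, dtype=float)
    return x / scaling_factor(x, i)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    base = rng.lognormal(mean=0.0, sigma=0.5, size=5000)  # synthetic stretch values
    early = base          # stand-in for the distribution after few periods
    late = 25.0 * base    # same shape, larger scale
    print("X(early) =", scaling_factor(early))
    print("X(late)  =", scaling_factor(late))
    print("collapse :", np.allclose(rescale(early), rescale(late)))
```

The synthetic data here are self-similar by construction, so the collapse is exact; for stretching data from an actual flow the collapse would only be expected asymptotically and over part of the distribution.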

Another potentially useful technique is multifractal scaling. The most fruitful application of
this concept in fluid mechanics, so far, has been in the context of turbulence (Sreenivasan,
1991). The explanation, again, is in terms of stretching. Consider the field of $\lambda(\mathbf{x},t)$,
corresponding to a very large number of initial conditions $\mathbf{X}$ distributed in a domain V.
Divide V into boxes of equal size r and label each box by an index i. The measure $\mu_r(i)$ is
the amount of $\lambda$ in the box, of volume $V_r$, normalized by the total amount of $\lambda$ in all
the boxes:

$\mu_r(i) = \int_{V_r(i)} \lambda(\mathbf{x})\, dV \, / \, \int_V \lambda(\mathbf{x})\, dV$  (32)

In turn the measure $\mu_r(i)$ can be used to define the strength $\alpha(i)$ as

$\mu_r(i) \sim r^{\alpha(i)} \;\Rightarrow\; \alpha(i) = \log(\mu_r(i)) / \log(r)$.  (33)

Multifractal behavior corresponds to the case where the probability density function of $\alpha$
exhibits self-similar behavior over a range of length scales r. This implies that the number
of boxes $N_r(\alpha)$ where $\alpha$ has values in a range between $\alpha$ and $\alpha + d\alpha$ can be expressed
in terms of an invariant function $f(\alpha)$, according to

$N_r(\alpha)\, d\alpha \sim r^{-f(\alpha)}\, d\alpha$  (34)

$f(\alpha)$ is called the multifractal spectrum. The use of multifractal concepts in chaotic
advection is discussed in Muzzio et al. (1992).
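
The box measure and the strength defined in (32) and (33) can be sketched as follows. The Python code is a minimal illustration under stated assumptions: the stretching field is given on a square grid covering a unit domain, it is strictly positive, the box size r is expressed as a fraction of the domain length, and the function name `box_measure_and_strength` is hypothetical.

```python
import numpy as np


def box_measure_and_strength(field, box):
    """Box measure mu_r(i) and strength alpha(i) for a positive 2-D field.

    field : (n, n) array of stretching values on a grid over a unit square
    box   : number of grid cells along one side of each box (must divide n)
    """
    field = np.asarray(field, dtype=float)
    n = field.shape[0]
    if field.shape != (n, n):
        raise ValueError("expected a square 2-D array")
    if box <= 0 or box >= n or n % box != 0:
        raise ValueError("box must divide the grid size and be smaller than it")
    nb = n // box
    # Sum the field inside each box-by-box block of grid cells.
    sums = field.reshape(nb, box, nb, box).sum(axis=(1, 3))
    mu = sums / field.sum()        # normalized amount of stretching per box
    r = box / n                    # box size relative to the domain length
    alpha = np.log(mu) / np.log(r)
    return mu, alpha


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    stretch = rng.lognormal(mean=0.0, sigma=1.0, size=(128, 128))  # synthetic field
    for box in (2, 4, 8, 16):
        mu, alpha = box_measure_and_strength(stretch, box)
        print(f"r={box / 128:.4f}  alpha range: {alpha.min():.3f} to {alpha.max():.3f}")
```

A spatially uniform field gives a strength equal to the dimension of the domain in every box; departures from that value indicate where stretching is concentrated or depleted. Estimating the spectrum itself would require histograms of the strength over a range of box sizes, which this sketch does not attempt.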

7.  Systems  Studied 

It might be argued that the typical systems studied, to date, in the context of chaotic
advection are unrealistic, and hence irrelevant, from an oceanographic viewpoint. That
would be a mistake. The proper way to understand these examples is not as faithful
representations  of  real  systems  but  rather  as  analyzable  prototypes  yielding  physical  insight 
and  increased  basic  knowledge.  They  act,  in  short,  as  a  sort  of  yardstick  with  respect  to 
which  we  can  measure  the  understanding  of  realistic  advection  problems.  Undoubtedly 
there  are  situations,  such  as  tidal  systems,  that  are  well  suited  for  immediate  applications 
(Ridderinkhof  and  Zimmerman,  1992).  Applications  to  more  complex  systems  still  lie  in 
the  future. 


Possibly  the  simplest  systems  are  the  tendril-whorl  flow  (Khakhar  et  al.,  1986)  and  the 
egg-beater  flow  (Franjione  and  Ottino,  1992).  The  tendril-whorl  flow  is  a  discontinuous 
succession  of  extensional  flows  and  twist  maps.  The  physical  motivation  for  this  flow  is 
that,  locally,  any  velocity  field  can  be  decomposed  into  extension  and  rotation.  The  egg- 
beater  flow  on  the  other  hand  can  be  seen  as  a  flow  occurring  in  a  square  region  of 
observation periodically invaded by shear flows entering at right angles to each other.
The first shear flow acts in a “horizontal” direction:

$x_{n+1} = x_n + T\, u(y_n)$  (35a)

$y_{n+1} = y_n$  (35b)

where T is the duration of the flow. This flow is written as $\mathbf{x}_{n+1} = \mathbf{H}\mathbf{x}_n$, where $\mathbf{x} = (x, y)$.

The second flow acts in a “vertical” direction:

$x_{n+1} = x_n$  (36a)

$y_{n+1} = y_n + T\, v(x_n)$  (36b)

and is written as $\mathbf{x}_{n+1} = \mathbf{V}\mathbf{x}_n$. The flow occurs in a domain which is periodic in both the x
and y directions. The overall mapping may be written as the composition of both maps,
i.e.,

$\mathbf{x}_{n+1} = \mathbf{V}\mathbf{H}\mathbf{x}_n = \mathbf{E}\mathbf{x}_n$.  (37)

A sequence of actions of the horizontal and vertical components, H and V, is denoted as
VHVHVH ··· and is an example of a mixing protocol.
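
A composed map of this kind is easy to iterate numerically. The Python sketch below is an illustration, not the flow of Franjione and Ottino: the velocity profile is an assumed single sine wave on a doubly periodic unit square, and the names `profile`, `profile_slope`, `egg_beater_step`, and `finite_time_stretch_rate` are hypothetical. It applies a horizontal shear followed by a vertical shear, carries the Jacobian of each period along the orbit, and accumulates the logarithm of the stretch of a material vector.

```python
import numpy as np


def profile(s):
    """Assumed velocity profile (hypothetical): one sine wave on the unit interval."""
    return np.sin(2.0 * np.pi * s)


def profile_slope(s):
    """Derivative of the assumed profile."""
    return 2.0 * np.pi * np.cos(2.0 * np.pi * s)


def egg_beater_step(x, y, T):
    """One period E = VH on the doubly periodic unit square.

    Returns the new position and the 2x2 Jacobian of the step.
    """
    # H: horizontal shear, the displacement depends on y only.
    x1 = (x + T * profile(y)) % 1.0
    JH = np.array([[1.0, T * profile_slope(y)], [0.0, 1.0]])
    # V: vertical shear, the displacement depends on the updated x.
    y1 = (y + T * profile(x1)) % 1.0
    JV = np.array([[1.0, 0.0], [T * profile_slope(x1), 1.0]])
    return x1, y1, JV @ JH


def finite_time_stretch_rate(x, y, T, n_periods, m0=(1.0, 0.0)):
    """Average of ln(stretch) per period for a material vector on one orbit."""
    m = np.asarray(m0, dtype=float)
    m = m / np.linalg.norm(m)
    log_stretch = 0.0
    for _ in range(n_periods):
        x, y, J = egg_beater_step(x, y, T)
        v = J @ m
        growth = np.linalg.norm(v)
        log_stretch += np.log(growth)
        m = v / growth   # renormalize so the vector never overflows
    return log_stretch / n_periods


if __name__ == "__main__":
    for T in (0.1, 0.4, 0.8):
        rate = finite_time_stretch_rate(0.3, 0.7, T, 500)
        print(f"T={T:.1f}  mean ln(stretch) per period = {rate:.4f}")
```

Each shear step has unit determinant, so the composed map preserves area regardless of the profile chosen. Swapping the order of the two steps, or changing the sequence from period to period, gives other mixing protocols.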

The next simplest flow, and historically the first analyzed in the context of chaotic
advection, is the blinking-vortex flow (Aref, 1984; Khakhar et al., 1986), which consists of
two corotating fixed point vortices that blink on and off periodically with a constant period
T. At any given time, only one of the vortices is on, so that the motion is made up of
consecutive twist maps about different centers.

All  these  flows  are  computational.  There  are  several  experimentally  realizable  flows 
though,  mostly  two-dimensional,  although  a  couple  of  experiments  have  been  carried  out 
in  three  dimensional  flows  as  well.  The  first  example  of  a  two-dimensional  flow  is  the 
cavity  flow  (Chien  et  al.,  1986;  Leong  and  Ottino,  1989).  The  cavity  flow  consists  of  a 
rectangular  region  capable  of  producing  a  two-dimensional  velocity  field  in  the  x-y  plane. 
Two opposing walls can be moved in a steady- or time-dependent manner, inducing
circulation within the cavity with one or multiple cells according to the aspect ratio of the
cavity and the mode of operation of the walls. Several new studies are focusing on
transport away from open cavities (Jana and Ottino, 1992) as well as systems involving one
or two cylinders rotating in a circular container. The two cylinder case is the so-called
journal bearing flow (Chaiken et al., 1987; Swanson and Ottino, 1990). Only a few studies
have been reported for three-dimensional flows (Kusch and Ottino, 1992).

There  are  several  insights  that  have  been  gained  in  terms  of  these  flows.  The  first  insight  is 
that  passive  structures  in  time-periodic  flows  evolve  in  an  iterative  fashion;  an  entire 
structure  is  mapped  into  a  new  structure  with  persistent  large-scale  features,  but  finer  and 
finer scale features are revealed at each period of the flow. Thin striations are produced at
the expense of thicker ones, and length scales (characterized by the first moment of the
striation thickness distribution) decrease exponentially in time. The length stretch and
striation thicknesses are inversely related. It has also been found that islands form
coherent  regions  that  translate,  stretch,  and  contract  periodically  and  undergo  a  net  rotation, 
preserving  their  identity.  Islands  display  symmetry  at  regular  intervals  of  time.  Island 
symmetry  is  caused  by  symmetric  placement  of  elliptic  points.  The  flow  within  islands  is 
weakly  rotational,  the  stretching  is  linear,  and  the  rates  of  rotation  are  usually  much  slower 
than  in  the  rest  of  the  flow.  Rotation  notwithstanding,  it  would  be  a  gross  mistake  to 
identify  these  coherent  regions  as  regions  of  vorticity. 

Another  insight  has  to  do  with  island  destruction.  For  example,  using  the  case  of  the  egg- 
beater  flow,  it  is  known  what  sequences  of  H’s  and  V’s  lead  to  best  mixing  in  a  minimum 
number  of  periods.  Another  insight  has  to  do  with  resonance  and  conditions  leading  to 
coupling  between  a  base  flow  and  a  perturbation.  Some  simple  cases  admit  analytical 
treatment.  A  recent  example  in  the  context  of  oceanography  is  the  paper  by  Samelson 
(1992)  addressing  the  issue  of  fluid  exchange  across  a  meandering  jet  in  terms  of  the 
Melnikov  method. 

Another  general  statement  that  can  be  made  regarding  chaotic  advection  and  transport  is 
that  the  rate  of  spreading  is  controlled  by  the  unstable  manifolds  of  the  hyperbolic  points 
belonging  to  the  lowest  order  periodic  points.  The  stretching  is  roughly  proportional  to  the 
value  of  the  eigenvalues  and  is  inversely  proportional  to  the  period  of  the  point.  An 
analysis  in  terms  of  manifolds  can  yield  valuable  information  regarding  the  transport  of 
material  in  the  flow.  All  applications  so  far  have  been  in  terms  of  rather  idealized  flows. 
Consider,  now,  the  application  of  these  concepts  to  one  of  the  most  studied  flows  in  fluid 
mechanics,  but  from  the  viewpoint  of  transport  still  a  rather  poorly  understood  flow.  The 
flow  considered  is  the  time-periodic  vortex  shedding  past  a  two-dimensional  circular 
cylinder with diameter D placed in a stream of fluid moving with uniform speed U in the
x-direction. As is well known, if $Re = \rho U D/\mu \ll 1$ the flow is symmetric with respect to
both  the  x-axis  and  the  y-axis;  as  the  Re  increases  the  flow  loses  y-symmetry  and  two 
attached  eddies  form  behind  the  cylinder  which  grow  in  size  with  increasing  Re  until,  at 
Re  =  40,  the  flow  ceases  to  be  steady  and  becomes  time-periodic.  Experiments  show  that 
when  Re  =  100  eddies  are  shed  periodically  from  the  top  and  bottom  part  of  the  cylinder: 
all  the  vortices  originating  from  the  top  rotate  in  one  direction;  all  the  vortices  originating 
from  the  bottom  rotate  in  the  opposite  direction  while  the  whole  pattern  of  vortices  travels 
downstream  but  with  a  speed  smaller  than  U.  As  Re  is  increased  above  200  or  so,  the 
flow  develops  three-dimensionality,  time-periodicity  is  lost  and  the  flow  ultimately 
produces  a  turbulent  wake.  The  case  of  interest  here  is  in  the  range  100<Re<200. 
Streakline  experiments  produce  the  so-called  von  Karman  wake,  something  that  has  been 
known for eight decades or so. However, an understanding of how transport proceeds in
this flow, i.e., how parcels of fluid move from one place to another and entrain material, is
still far from being clear. Instantaneous streamlines offer only partial help, even though
most of the recent attempts at explaining this topic address the problem from this
viewpoint (Perry et al., 1982). An analysis in terms of manifolds improves the picture
considerably (Shariff et al., 1991).

Such an analysis relies on the identification of two classes of periodic points: (i) parabolic
periodic points associated with separating and attaching streamlines (points of zero wall
shear), which produce unstable and stable manifolds, and
(ii) a period-one hyperbolic point located in the wake itself. Such points move as a function
of time and return to their original location after one full period. The typical picture is as
follows: the streamline corresponding to one of the separation points joins smoothly with
an attachment point to form a separation bubble, whereas another separation streamline
goes into the wake of the flow. Stable and unstable manifolds produce heteroclinic and
homoclinic intersections. In this particular flow, four types of transversal intersections are
possible: heteroclinic intersections are produced by intersections of stable manifolds of the
period-one hyperbolic point with unstable manifolds of the periodic points attached to the
cylinder, as well as by unstable manifolds of the hyperbolic point intersecting the stable
manifolds attached to the surface of the cylinder; homoclinic intersections are produced by
crossings of stable and unstable manifolds belonging to the hyperbolic point, as well as
those of parabolic points attached to the cylinder. The complete manifold picture of the
system is, however, more complex since there are additional period-one hyperbolic points
close to the surface of the cylinder as well as higher order periodic points; they, however,
seem to contribute much less to the gross aspects of the transport in the flow. The
manifold structure results in the qualitative picture shown in Figure 3. Figure 4 shows
computed pictures corresponding to Re = 180. For the sake of clarity, Figure 4a shows only
the manifold structure corresponding to the upper wake, whereas Figure 4b shows the
manifold structure corresponding to the lower wake.

The  manifold  structure  provides  a  template  for  stretching  and  transport  and  provides  a 
qualitative picture for the stretching and folding of a streakline in the wake. More and more
details are revealed as the system evolves. This sort of iterative process has implications for
the distribution of stretching within the flow. This is particularly clear in the case of time
periodic flows. In this case the stretching between period 0 and period n, $\lambda_{0,n}$, can be
expressed as a multiplication of stretching corresponding to individual periods, i.e.,

$\lambda_{0,n} = \lambda_{0,1}\,\lambda_{1,2} \cdots \lambda_{n-1,n}$,  (38)

where $\lambda_{i-1,i}$ is the stretching experienced in the interval i-1 to i. Moreover, due to the
chaotic character of the flows, the $\lambda_{i-1,i}$'s quickly become uncorrelated. These two
observations suggest that stretching can be considered as a multiplicative process with
loosely correlated steps and is, therefore, an ideal situation for the application of scaling
concepts. The application of single parameter scaling concepts shows that as the number
of periods increases beyond 5 or so a wide portion of the probability density functions of
stretching overlap when re-plotted in scaled form. Closer examination of the scaled results
reveals additional insight: in general, flows with islands exhibit spatial segregation with
respect to stretching even within chaotic regions; one set of points wanders throughout the
‘bulk of the chaotic region’ and undergoes exponential stretching; the other stays close to
regular islands for many periods and stretches very slowly.

Figure 3. Qualitative picture of manifold structure in the vortex shedding regime of flow behind a
circular cylinder.
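
The multiplicative picture of equation (38) can be illustrated with synthetic numbers. The Python sketch below is a toy model under loud assumptions: the per-period stretches are drawn as independent lognormal random numbers, which is a stand-in for the uncorrelated limit and not a result from any of the flows discussed, and the names `accumulate_stretch` and `simulate_log_stretch` are hypothetical. It forms the running product of per-period stretches in log space and shows that the mean of the logarithm of the total stretch grows linearly with the number of periods.

```python
import numpy as np


def accumulate_stretch(per_period_stretch):
    """Total stretch after each period as the running product of the
    per-period stretches, computed in log space to avoid overflow."""
    logs = np.log(np.asarray(per_period_stretch, dtype=float))
    return np.exp(np.cumsum(logs, axis=-1))


def simulate_log_stretch(n_points, n_periods, seed=0):
    """Log of the total stretch for many points, assuming (hypothetically)
    independent lognormal stretches with mean log 0.5 and spread 0.3."""
    rng = np.random.default_rng(seed)
    factors = rng.lognormal(mean=0.5, sigma=0.3, size=(n_points, n_periods))
    return np.log(accumulate_stretch(factors))


if __name__ == "__main__":
    print(accumulate_stretch([2.0, 3.0, 4.0]))   # running product of three periods
    log_lam = simulate_log_stretch(n_points=2000, n_periods=20)
    for n in (5, 10, 20):
        column = log_lam[:, n - 1]
        print(f"n={n:2d}  mean ln(stretch)={column.mean():.3f}  std={column.std():.3f}")
```

In this toy model the logarithm of the total stretch is a sum of independent terms, so its mean grows in proportion to n and its spread in proportion to the square root of n. Points that linger near islands would break the independence assumption, consistent with the segregation noted above.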

Another useful tool is multifractals. The simplest application of multifractal concepts to
mixing arises in the case of flows with no islands. In this case, the spatial distribution of
stretching is well described by multifractal scaling if the very high tail of the distribution of
stretchings is neglected. Moreover, different methods for obtaining the multifractal
spectrum $f(\alpha)$ agree reasonably well, producing a time-independent self-similar
distribution. For flows with islands (e.g., the flow between eccentric cylinders), the
spectrum $f(\alpha)$ is time-dependent and, therefore, not self-similar. However,
multifractal concepts suggest a single-parameter scaling for the distribution of Lyapunov
exponents that works well for flows without islands (Muzzio et al., 1992). A possible
point of confluence of scaling concepts, multifractal descriptions and transport is in the
interpretation and prediction of dispersion of passive scalars. Some work has been done
(Pasmanter, 1988), but it is obvious that much more remains to be done.

Figure 4. Intersection between unstable manifolds associated with parabolic points attached to the
cylinder, and stable manifolds associated with a periodic hyperbolic point, H, at Reynolds number
180: (a) represents the manifold structure corresponding to the upper wake, (b) represents the
manifold structure corresponding to the lower wake.

8.  Conclusions

Some  familiarity  with  chaotic  advection  appears  to  be  a  necessary  ingredient  in  developing 
an  understanding  of  mixing  and  dispersion  in  complex  flows.  It  is  apparent  that  the  current 
kinematical vocabulary necessary to deal with stirring and mixing needs to be amplified.
Chaotic  advection  clearly  demonstrates  the  pitfalls  of  flow  visualization  in  terms  of  velocity 
field  information  such  as  instantaneous  streamlines  and  particle  paths;  both  can  be 
relatively  simple  and  streaklines  extremely  complex.  Concepts  such  as  periodic  points  and 
manifolds  seem  both  useful  and  necessary  in  interpreting  issues  involving  coherence  and 
transport.  The  flow  within  coherent  islands  in  two-dimensional  chaotic  flows  is  weakly 
rotational  (in  the  sense  that  there  is  a  net  twist)  but  that  rotation  notwithstanding,  it  would 
be  a  gross  mistake  to  identify  these  coherent  regions  as  regions  of  vorticity. 

Advances, to date, are mostly in the form of physical insight and basic knowledge obtained
in terms of computational and experimental studies in simple flows. Currently available
results can be used in two different ways: (i) to make qualitative predictions regarding the
behavior of more complex systems, (ii) as a yardstick with respect to which we can
measure the understanding (or lack thereof) of such problems. Most studies are for
two-dimensional flows but attempts at extending analyses to three-dimensional cases are
currently underway. However, many problems of interest in ocean sciences are inherently
two-dimensional. The most obvious example might involve lateral mixing descriptions in
terms of large circulation models. Other problems can be encountered at smaller scales.
Examples might include stirring in tidal systems, an inherently time periodic case,
transport and entrainment in meandering jets, and penetrative convection under ice shelves.

ABSTRACT

Three topics relating to chaotic ocean physics are discussed. These are (1) low-order El
Niño dynamics, (2) lateral stirring processes, and (3) linear ocean waves in the geometric
limit.  Each  topic  is  discussed  separately;  emphasis,  in  each  case,  is  given  to  the  manner  in 
which  ideas  associated  with  chaos  and  low-order  dynamical  systems  complement  more 
traditional  approaches  to  the  same  problem. 

INTRODUCTION 

In  this  paper  three  topics  relating  to  chaotic  ocean  physics  are  discussed.  This  list  is  not 
intended  to  be  an  exhaustive  list  of  topics  in  ocean  physics  to  which  ideas  relating  to  chaos 
can  be  applied.  Our  discussion  of  these  three  topics — which  were  chosen  because  the 
author  has  some  familiarity  with  them — serves  to  illustrate  several  important  concepts 
likely  to  be  useful  in  other  oceanographic  applications  as  well.  It  is  our  feeling  that  the 
ideas  relating  to  chaotic  dynamical  systems  discussed  in  this  paper  are  useful  but  must  be 
applied  in  a  sober  fashion  which  complements  more  traditional  approaches.  When 
properly  applied,  these  ideas  provide  a  vehicle  to  increase  our  understanding  of  various 
physical  processes  in  the  ocean  in  an  evolutionary  fashion.  Expectations  of  gaining  new 
insight  of  a  revolutionary  nature  are  not  likely  to  be  realized. 

In each of the three sections that follow, we discuss a topic in ocean physics (low-order El
Niño dynamics, lateral stirring processes, linear ocean waves in the geometric limit) to
which  ideas  associated  with  chaos  can  be  applied.  Background  material  relating  to 
dynamical  systems  and  chaotic  dynamics  is  introduced  as  necessary  in  the  context  of  the 
problems  treated.  This  approach  is  natural  inasmuch  as  our  intention  is  not  to  provide  a 
tutorial  on  chaos;  instead,  we  seek  to  demonstrate  that  these  ideas  are  useful  in  the  context 
of  specific  problems  in  ocean  physics.  All  three  topics  discussed  in  this  paper  are  treated 
in  more  detail  elsewhere;  references  are  provided  below.  So  as  not  to  duplicate  this 
material,  we  focus  here  on  the  rationale  for  applying  ideas  relating  to  chaos  and  low-order 
dynamical systems. Stated somewhat differently, in this paper we focus more on the
questions being addressed than on details of the subsequent analysis. Some unifying
comments and observations concerning chaotic ocean physics are included in the final
section. 

LOW-ORDER EL NIÑO DYNAMICS

The El Niño/Southern Oscillation (ENSO) system is a quasi-periodic oscillation of the
tropical Pacific Ocean and overlying atmosphere (see, e.g., Enfield, 1989). The ENSO
system involves interactions among eastern basin sea surface temperature (SST), zonal
trade winds and the thermocline depth (Bjerknes, 1969; Wyrtki, 1975). El Niño
events, characterized by anomalously high eastern basin SST, weak trade winds, and a
shallow western basin thermocline, are typically separated by three to five years.

Models  of  the  ENSO  system  vary  considerably  in  complexity.  At  one  extreme  are  coupled 
ocean-atmosphere general circulation models (see, e.g., Neelin, 1990). That such models
produce  ENSO-like  behavior  should  come  as  no  surprise;  ENSO  behavior  is  surely 
contained  in  the  complicated  coupled  equations  of  motion/state  which  were  numerically 
solved.  It  is  our  feeling  that  simpler  models — provided  they  adequately  reproduce 
essential  features  of  the  system  being  modeled — are  more  insightful  inasmuch  as  they 
better  elucidate  the  essential  physical  processes  involved.  This  leads  naturally  to  the 
question  of  whether  the  essential  physics  of  the  ENSO  system  can  be  captured  in  simpler 
models. 

The  simplest  type  of  model  of  the  ENSO  system  which  has  been  proposed  consists  of  a 
small  number  (n,  say)  of  autonomous  ordinary  differential  equations, 

$$\frac{d\mathbf{x}}{dt} = \mathbf{f}(\mathbf{x}), \qquad \mathbf{x} = (x_1, x_2, \ldots, x_n). \tag{1}$$

The solution $\mathbf{x}(t)$ of these equations describes the temporal evolution of the system. The
$x_i$'s ($i = 1, 2, \ldots, n$) in such a model would include variables such as anomalies of eastern
basin SST, zonal winds and western basin thermocline depth. Vallis (1986) was the first
to propose a model of the ENSO system of this type. That this model produces
unphysical behavior for some choices of parameters (see, e.g., Vallis, 1988) is, in our
opinion, not terribly important; the significance of the Vallis (1986) paper is the
suggestion that the essential physics of the ENSO system can be captured in a severely
truncated physical model consisting of a low-order, autonomous dynamical system. The
word autonomous means that the function $\mathbf{f}$ in (1) does not depend explicitly on time;
physically, this restriction means that any quasi-oscillatory behavior in $x_i(t)$, which might
be associated with the occurrence of El Niño events, is the result of internal, self-sustained
dynamical processes rather than being the response to external stochastic
forcing. More recently, improved low-order models (autonomous dynamical systems) of
the ENSO system have been proposed by Schopf and Suarez (1988) and Münnich et al.
(1991). Before proceeding, it is worth noting that the notion of simple ENSO
dynamics (during the growth phase of El Niño events, at least) is generally accepted and
dates back to the seminal work of Bjerknes (1969) and Wyrtki (1975); the notion that the
complete ENSO cycle (and, in particular, the triggering of El Niño events) results from
internal dynamics (i.e., that these oscillations are self-sustained) is not universally
accepted.

These  considerations  led  Bauer  and  Brown  (1992)  to  address  the  question  of  whether 
observations  of  the  ENSO  system  are  consistent  with  underlying  low-order  dynamics. 

This question was addressed via the process of phase space reconstruction whereby
discrete samples of a single variable, $y(t_j)$, $j = 1, 2, \ldots$ (monthly samples of eastern basin
SST were used in the Bauer and Brown analysis), are used to construct a discretely
sampled multidimensional phase space portrait, $\mathbf{y}(t_k)$, $k = 1, 2, \ldots$ A simple way to carry
out this process is to use delay time coordinates: $y_1(t_k) = y(t_k)$, $y_2(t_k) = y(t_{k+1})$,
$y_3(t_k) = y(t_{k+2})$, etc. Surprisingly, perhaps, the reconstructed, discretely sampled phase space
trajectory $y_i(t_k)$ constructed in this fashion can be shown (Broomhead and King, 1986),
under appropriate conditions, to reproduce with only minor distortion (a diffeomorphism)
the true multidimensional phase space portrait $\mathbf{x}(t)$ of the underlying dynamical system.
Unfortunately, this procedure is sensitive to noise and therefore generally works poorly on
geophysical data. The shortcoming was overcome by Bauer and Brown by using a
technique developed by Broomhead and King (1986), see also Vautard and Ghil
(1989), wherein temporal empirical orthogonal functions are used as basis functions for
the reconstructed phase space trajectory. Details of this analysis will not be repeated here.
The results of this analysis suggest that the underlying ENSO dynamics are approximately
those of a low-order system; we urge the reader to carefully assess the evidence presented
and come to his/her own conclusion.
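
The reconstruction steps described above can be sketched in a few lines. The following is a minimal illustration, not the Bauer and Brown analysis itself: the function names, the window length, and the synthetic sine-wave "SST" record are all assumptions introduced here. It builds delay vectors from a scalar series and projects them onto the leading temporal empirical orthogonal functions obtained from a singular value decomposition.

```python
import numpy as np

def delay_embed(y, dim, lag=1):
    """Stack delay vectors (y[k], y[k+lag], ..., y[k+(dim-1)*lag]) as rows."""
    y = np.asarray(y, dtype=float)
    n_vectors = y.size - (dim - 1) * lag
    if n_vectors <= 0:
        raise ValueError("time series too short for this dim and lag")
    return np.column_stack([y[i * lag : i * lag + n_vectors] for i in range(dim)])

def svd_reconstruct(y, window, n_keep):
    """Project delay vectors onto the n_keep leading temporal EOFs."""
    traj = delay_embed(y, window)
    traj = traj - traj.mean(axis=0)          # remove the mean of each delay coordinate
    # Rows of vt are the temporal EOFs; s holds the singular values.
    _, s, vt = np.linalg.svd(traj, full_matrices=False)
    coords = traj @ vt[:n_keep].T            # reconstructed phase space coordinates
    return coords, s

# Hypothetical stand-in for a monthly record: a pure 48-month oscillation.
t = np.arange(480)
y = np.sin(2.0 * np.pi * t / 48.0)
coords, s = svd_reconstruct(y, window=24, n_keep=2)
```

For this noise-free oscillation only a few singular values are non-negligible, so the trajectory is confined to a low-dimensional subspace; with noisy data the trailing singular values form a noise floor, and the choice of `n_keep` becomes a judgment call.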

It is worth emphasizing in this context that the question of chaotic ENSO dynamics is
secondary to the question of whether ENSO dynamics are approximately those of a low-order
system. If the latter question is answered affirmatively, then questions concerning
chaotic behavior become relevant. Among these are (1) Does the system evolve
chaotically, and if so, what is the predictability timescale (reciprocal of the largest positive
Lyapunov exponent)? (2) What is the dimension of the corresponding attractor? At the
present time these questions are, in our opinion, premature. It is worth pointing out,
however, that if the underlying dynamics are approximately those of a low-order
system (even a chaotic one), this would lead to some long-term predictability in the sense
that it would be known that the system's state vector $\mathbf{x}$ must, at all times, lie on some
attractor, although its precise position may not be predictable.

LATERAL  OCEAN  STIRRING  PROCESSES 

In  the  ocean,  many  water  properties  such  as  temperature,  salinity,  oxygen  content  or 
pollutant  concentration  can  be  treated  approximately  as  passive  fluid  parcel  markers. 
Passive  means  that  the  flow  field  evolves  independently  of  the  initial  distribution  of  the 
tracer.  In  order  to  understand  the  distribution  of  these  oceanic  tracers  and  how  they 
evolve  in  time,  one  needs  to  understand  the  process  by  which  passive  tracers  get 
redistributed. Our discussion of this process focuses on the lateral stirring (advective
tracer transport) process; we ignore the quasi-diffusive 3-d behavior that takes place at the
smallest scales (internal wave and smaller).

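The advective transport of a passive tracer in a two-dimensional incompressible flow is
described by the equation

$$\frac{\partial\theta}{\partial t} - \frac{\partial\psi}{\partial y}\frac{\partial\theta}{\partial x} + \frac{\partial\psi}{\partial x}\frac{\partial\theta}{\partial y} = 0, \tag{2}$$

subject to the initial condition $\theta(x,y,0) = \theta_0(x,y)$. Here $\theta(x,y,t)$ is the tracer
concentration and $\psi(x,y,t)$ is the streamfunction. It follows from (2) that $\theta$ is constant
following particle trajectories, $x(t)$, $y(t)$, which satisfy

$$\frac{dx}{dt} = -\frac{\partial\psi}{\partial y}, \qquad \frac{dy}{dt} = \frac{\partial\psi}{\partial x}. \tag{3}$$

Thus, in order to understand the temporal evolution of $\theta(x,y,t)$, even in a statistical
sense, one needs to understand the behavior of particle trajectories and understand the
implications of the form of the Lagrangian equations of motion (3).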

The Lagrangian equations of motion constitute a generally nonautonomous Hamiltonian
system with one degree of freedom; $\psi(x,y,t)$ plays the role of the Hamiltonian $H(p,q,t)$. It
is extremely important to distinguish integrable Hamiltonian systems from nonintegrable
ones. For the system (3) integrability implies that there exists a single-valued function
$\chi(x,y,t)$ which is constant following particle trajectories, $d\chi/dt = 0$. If the flow is steady,
$\partial\psi/\partial t = 0$, then the system of equations is said to be autonomous and the streamfunction is
the required constant of the motion, $d\psi/dt = 0$. This follows from equations (3). In
nonsteady flows, however, the equations of motion (3) are nonautonomous and are
generally nonintegrable. This observation is important inasmuch as nonintegrability is a
necessary, but not sufficient, condition for chaotic motion (see, e.g., Tabor, 1989).
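
The role of the streamfunction as a constant of the motion can be checked numerically. The sketch below is only an illustration under stated assumptions: the cellular streamfunction $\psi = \sin(x + \epsilon\sin t)\sin y$, the parameter `eps`, and the starting point are hypothetical choices made here, and the sign convention is $dx/dt = -\partial\psi/\partial y$, $dy/dt = \partial\psi/\partial x$. A particle is advected with a fourth-order Runge-Kutta scheme and the change in $\psi$ along its path is recorded for a steady and an unsteady flow.

```python
import math

def psi(x, y, t, eps):
    # Hypothetical cellular streamfunction; eps sets a time-periodic lateral shift.
    return math.sin(x + eps * math.sin(t)) * math.sin(y)

def velocity(x, y, t, eps):
    # dx/dt = -dpsi/dy, dy/dt = +dpsi/dx for the streamfunction above.
    xs = x + eps * math.sin(t)
    return -math.sin(xs) * math.cos(y), math.cos(xs) * math.sin(y)

def advect(x, y, eps, t_end=20.0, dt=0.01):
    """Integrate one particle trajectory with classical RK4."""
    t = 0.0
    for _ in range(int(round(t_end / dt))):
        u1, v1 = velocity(x, y, t, eps)
        u2, v2 = velocity(x + 0.5 * dt * u1, y + 0.5 * dt * v1, t + 0.5 * dt, eps)
        u3, v3 = velocity(x + 0.5 * dt * u2, y + 0.5 * dt * v2, t + 0.5 * dt, eps)
        u4, v4 = velocity(x + dt * u3, y + dt * v3, t + dt, eps)
        x += dt * (u1 + 2.0 * u2 + 2.0 * u3 + u4) / 6.0
        y += dt * (v1 + 2.0 * v2 + 2.0 * v3 + v4) / 6.0
        t += dt
    return x, y

x0, y0, t_end = 0.3, 1.0, 20.0

# Steady flow (eps = 0): psi should be conserved along the trajectory.
xs_end, ys_end = advect(x0, y0, eps=0.0, t_end=t_end)
drift_steady = abs(psi(xs_end, ys_end, t_end, 0.0) - psi(x0, y0, 0.0, 0.0))

# Unsteady flow (eps = 0.5): psi is no longer a constant of the motion.
xu_end, yu_end = advect(x0, y0, eps=0.5, t_end=t_end)
drift_unsteady = abs(psi(xu_end, yu_end, t_end, 0.5) - psi(x0, y0, 0.0, 0.5))
```

In the steady case the drift in $\psi$ is at the level of the integration error, while in the unsteady case it is not small. This shows only the loss of the invariant; it does not by itself establish that the trajectory is chaotic.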

The  distinction  between  chaotic  and  nonchaotic  particle  trajectories  is  extremely  important 
in  the  context  of  passive  tracer  transport.  The  reason  is  that  chaotic  particle  trajectories 
exhibit  extreme  sensitivity  to  their  initial  conditions.  This  means  that  neighboring  particle 
trajectories  diverge  from  one  another  at  an  exponential  rate,  on  average.  It  follows  that 
material  lines  of  fluid  will  also  grow  exponentially,  on  average.  This  type  of  behavior 
leads  to  very  efficient  stirring  (advective  transport)  of  a  tracer,  and,  in  turn,  enhances  the 
mixing  (diffusive  transport)  of  the  tracer  at  smaller  scales.  These  ideas  are  discussed  in 
more  detail  by  Ottino  (1990)  (see  also  the  contribution  by  Ottino  in  this  volume)  and 
Brown  and  Smith  (1990, 1991).  The  latter  publications  also  address  the  question  of 
whether  proxy  ocean  particle  trajectories  (acoustically  tracked  submerged  SOFAR  floats) 
exhibit extreme sensitivity. Previously, Osborne et al. (1986) had addressed this question
using satellite-tracked surface drifters. This work suggests that float/drifter trajectories do
exhibit the important property of extreme sensitivity which is associated with chaotic
systems.

It is important to note, however, that typical oceanographic realizations of $\psi(x,y,t)$ are
significantly more complicated than the idealized systems to which notions relating to
chaos are normally applied. Specifically, almost integrable systems with periodic time
dependence are fairly well understood (see, e.g., Tabor, 1989). In such systems, the onset
of chaos is associated with resonances between periodic motion in the nearby integrable
system and the period of the temporal variations of the streamfunction. It is not clear
whether results which apply to time-periodic streamfunctions carry over to the problem
where the streamfunction has more general time dependence; there remains a significant
gap between the complexity of the ocean and that of the idealized systems treated in
textbooks on nonlinear dynamics.

This gap in complexity offers challenges to both oceanographers and nonlinear dynamicists
and provides the opportunity for the two groups to work together in a mutually beneficial
fashion. In fact, this has already happened. In the aforementioned work of Osborne et al.
(1986), the authors argued that the fractal characteristics of drifter trajectories were
attributable to underlying stochasticity (power law energy spectrum of the velocity field)
rather than being associated with a strange attractor. This work led to several studies on
the relationship between stochasticity and fractal behavior.

LINEAR  OCEAN  WAVES  IN  THE  GEOMETRIC  LIMIT 

In the geometric (ray theoretical) limit, any type of linear wave motion can be described
using a ray approximation (see, e.g., Lighthill, 1978). Such a description is valid when the
properties of the ocean, including its boundaries, vary slowly on a scale of wavelengths.
The ray equations are

$$\frac{dx_i}{dt} = \frac{\partial\omega}{\partial k_i}, \qquad \frac{dk_i}{dt} = -\frac{\partial\omega}{\partial x_i}, \tag{4}$$

where

$$\omega = \omega(\mathbf{k}, \mathbf{x}). \tag{5}$$

Here the $x_i$'s are position coordinates and the $k_i$'s are the corresponding components of the
wavenumber vector. The form of the function $\omega(\mathbf{k}, \mathbf{x})$, the dispersion relation, depends
on the type of wave being considered. For example, for surface gravity waves
propagating in water of variable depth $h(\mathbf{x}) = h(x,y)$,

$$\omega(\mathbf{k}, \mathbf{x}) = \left[ g|\mathbf{k}| \tanh\big(|\mathbf{k}|\,h(\mathbf{x})\big) \right]^{1/2}. \tag{6}$$

In  the  following,  some  important  ideas  are  illustrated  using  this  form  of  the  dispersion 
relation.  We  emphasize,  however,  that  equations  (4)  and  (5)  are  very  general  and  that  the 
following  considerations  are  applicable  to  any  type  of  linear  wave  motion. 

Equations (4) and (6) constitute an autonomous Hamiltonian system with two degrees of
freedom; $\omega(\mathbf{k}, \mathbf{x})$ serves as the Hamiltonian $H(\mathbf{p}, \mathbf{q})$. Autonomous means that the
Hamiltonian function does not depend explicitly on the independent variable, time.
Integrability of such a system requires that two independent constants of the motion exist.
One of these is $\omega$; it follows from equations (4) that $d\omega/dt = 0$. Only for very special
bathymetric variations $h(x,y)$ does the second required constant of the motion exist. For
example, if $h = h(x)$, then it follows from the second of equations (4) that $dk_y/dt = 0$; under
such conditions $k_y$ is a second constant of the motion. Such behavior is not typical,
however.
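
The two constants of the motion in the integrable case can be illustrated with a short ray-tracing sketch. Everything specific here is an assumption made for illustration: the shoal-shaped bathymetry `depth(x)`, the starting position and wavenumber, and the step size. The ray equations are integrated with a fourth-order Runge-Kutta scheme using the surface gravity wave dispersion relation, with analytic derivatives of $\omega$.

```python
import math

G = 9.81  # gravitational acceleration, m/s^2

def depth(x):
    # Hypothetical bathymetry depending on x only: 20 m deep, shoaling to 10 m at x = 0.
    return 20.0 - 10.0 * math.exp(-(x / 200.0) ** 2)

def depth_slope(x):
    # dh/dx for the bathymetry above.
    return (20.0 * x / 200.0 ** 2) * math.exp(-(x / 200.0) ** 2)

def omega(kx, ky, x):
    k = math.hypot(kx, ky)
    return math.sqrt(G * k * math.tanh(k * depth(x)))

def ray_rhs(state):
    """Right-hand side of the ray equations for state = (x, y, kx, ky)."""
    x, y, kx, ky = state
    k = math.hypot(kx, ky)
    h = depth(x)
    w = omega(kx, ky, x)
    sech2 = 1.0 / math.cosh(k * h) ** 2
    dw_dk = G * (math.tanh(k * h) + k * h * sech2) / (2.0 * w)   # d(omega)/d|k|
    dw_dh = G * k * k * sech2 / (2.0 * w)                        # d(omega)/dh
    # dk_y/dt = -d(omega)/dy = 0 because the depth does not depend on y.
    return (dw_dk * kx / k, dw_dk * ky / k, -dw_dh * depth_slope(x), 0.0)

def trace_ray(state, dt, n_steps):
    """Classical RK4 integration of one ray."""
    for _ in range(n_steps):
        k1 = ray_rhs(state)
        k2 = ray_rhs(tuple(s + 0.5 * dt * d for s, d in zip(state, k1)))
        k3 = ray_rhs(tuple(s + 0.5 * dt * d for s, d in zip(state, k2)))
        k4 = ray_rhs(tuple(s + dt * d for s, d in zip(state, k3)))
        state = tuple(s + dt * (a + 2.0 * b + 2.0 * c + d) / 6.0
                      for s, a, b, c, d in zip(state, k1, k2, k3, k4))
    return state

start = (-600.0, 0.0, 0.08, 0.06)          # x (m), y (m), kx, ky (rad/m)
end = trace_ray(start, dt=0.5, n_steps=600)
w_start = omega(start[2], start[3], start[0])
w_end = omega(end[2], end[3], end[0])
```

Along the ray, `w_end` agrees with `w_start` to within the integration error and the cross-shoal wavenumber component is unchanged. For a depth field varying in both $x$ and $y$, the same integrator applies once the second wavenumber equation is filled in, but the second invariant is generally lost.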

In  the  absence  of  a  second  constant  of  the  motion,  the  possibility  of  chaotic  ray  motion 
exists.  Numerical  experiments  strongly  support  the  expectations  that,  under  such 
conditions, ray trajectories exhibit chaotic behavior (see, e.g., Brown et al., 1991; Smith et
al.,  1992;  Abdullaev  and  Zaslavskii,  1989).  These  studies,  however,  assume  spatially 
periodic  ocean  properties.  This  assumption  allows  readily  available  mathematical  tools  to 
be  exploited.  Unfortunately,  a  similar  set  of  tools  is  not  available  to  treat  problems 
involving  more  realistic  (nonperiodic)  ocean  structure.  Chaotic  behavior,  which 
presumably  persists  in  some  form  in  realistic  ocean  environments,  is  characterized  by 
exponential  growth  of  small  errors  and  leads  unavoidably  to  the  conclusion  that,  under 
such  conditions,  predictability  of  ray  trajectories  is  limited  to  small  times. 


Does  this  imply  a  lack  of  predictability  of  the  corresponding  wavefield?  Probably  not. 

The  reason  is  that  the  ray  description  of  the  wave  motion  is  a  nonlinear  approximation  to  a 
linear  wave  equation.  (For  the  system  described  by  (4)  and  (6)  the  corresponding  linear 
wave equation is the mild slope equation; see, e.g., Mei, 1983). Because nonlinearity is a
necessary  condition  for  chaos,  the  linear  wave  equation  does  not  admit  chaotic  solutions. 
These  solutions  may  have  different  properties,  however,  depending  on  whether  the 
corresponding  ray  trajectories  are  chaotic  or  not.  (There  is  a  vast  literature  on  the 
corresponding  quantum  mechanical  problem — see  Reichl,  1992,  for  an  excellent  recent 
review.)  Wavefield  statistics,  for  example,  may  be  very  different  depending  on  whether 
the  corresponding  ray  trajectories  are  chaotic  or  not. 

It should also be noted that for ocean waves the linear wave equation is itself an
approximation to a nonlinear wave equation. This leads to more questions. Does the
nonlinear wave equation admit chaotic solutions, and, if so, is there any connection
between this chaos and chaotic behavior in the corresponding ray trajectories? The
answers are currently not known.

FINAL  REMARKS 

Our discussion of the topics in the three preceding sections led to a number of questions
relating  to  chaotic  ocean  dynamics.  For  the  most  part,  the  questions  corresponding  to  the 
different  topics  were  not  the  same.  This  is  consistent  with  our  view  that  chaotic  ocean 
physics  should  not  be  treated  as  a  unified  branch  of  ocean  physics.  Rather,  results  from 
studies  of  low-order  dynamical  systems  should  be  thoughtfully  applied  to  selected 
problems  in  ocean  physics  in  a  manner  which  complements  more  traditional  approaches  to 
the  same  problem. 

Not  surprisingly,  we  have  seen  that  the  ocean  is  more  complicated  than  the  systems 
normally  studied  in  the  context  of  nonlinear  dynamics.  This  discrepancy  should  be  viewed 
as  a  challenge  to  both  physical  oceanographers  and  nonlinear  dynamicists;  both  groups 
stand  to  benefit  from  collaborating.  The  example  given  earlier  of  Osborne  et  al.'s  (1986) 
work  motivating  studies  on  the  relationship  between  stochasticity  and  fractal  behavior  is 
an  excellent  example  of  precisely  this  type  of  symbiotic  relationship. 

Abstract

The theory of dissipative chaos appears to promise great insights into the behavior of
natural systems like the ocean. Results based upon model simulations show the possibility
that phenomena such as El Niño are chaotic. Chaotic phenomena also demonstrate that
certain traditional methods are not appropriate for chaotic systems. For example, a
perturbation from the linear solution provides no insight into the behavior of the nonlinear
system if that system is chaotic, even if the nonlinear terms are small. The existence of
chaos implies an inherent limit to the predictability of a system; this is one reason why it is
important to determine if a system is chaotic.

However,  when  one  attempts  to  make  estimates  of  measures  of  chaos  (dimensions, 
Lyapunov  exponents,  etc.)  from  oceanographic  data  one  is  faced  with  the  fact  that  the 
methods  that  quantify  chaotic  properties  of  systems  from  data  require  an  enormous 
number of degrees of freedom for any reasonable degree of confidence. Again, traditional
analysis techniques can make matters worse and not better. An example of this is the use
of a smoothing filter: the filter can increase the dimension of the resulting data set by as
much as 1.

1  What  chaos  might  contribute 

There  are  several  ways  that  ideas  from  chaotic  dynamics  may  contribute  to  an 
understanding  of  the  ocean.  The  primary  question  is  whether  or  not  any  oceanic 
phenomena  are  chaotic. 

If  an  oceanic  phenomenon  is  chaotic,  that  will  automatically  impose  inherent  limits  to  the 
predictability  of  the  system.  If  this  is  so,  it  is  important  to  be  able  to  quantify  what  the 
predictability  limit  is. 

1.1 Are phenomena such as El Niño chaotic?

A  first  question  to  ask  is  whether  any  oceanic  phenomena  are  actually  driven  by  chaotic 
dynamics.  The  identification  of  chaos  in  the  ocean  would  mean  that  the  relatively 
complicated  behavior  that  is  observed  could  be  described  in  terms  of  a  system  with  a  small 
number of degrees of freedom. The possibility that El Niño is chaotic has been
investigated  by  looking  at  the  available  data  (Fraedrich,  1988),  and  by  model  studies 
(Vallis,  1986). 

Figure 1. The Southern Oscillation Index, a monthly time series of sea level pressure differences between
Tahiti and Darwin, Australia. (These data are scaled to standardized dimensionless units so that the series
has a zero mean and a unit standard deviation.)


Figure  1  shows  the  Southern  Oscillation  Index;  its  irregularity  is  visually  reminiscent  of 
chaotic  time  series.  This  time  series  has  fewer  than  500  data  points  in  it,  which  is 
unfortunately  too  few  to  make  reliable  calculations  of  the  dimension  of  the  underlying 
system. Model studies of El Niño indicate that it is possible to mimic time series such as
the Southern Oscillation Index with models that are chaotic. Figure 2 shows the Vallis
(1986) model. This very simple model produces an El Niño event with about the right
periodicity.  The  system  is  chaotic  and  has  a  Lyapunov  dimension  of  2.088  (see  Fig.  3). 

1.2 Chaotic Lagrangian trajectories?

The irregular nature of drifter trajectories is suggestive of either turbulence or chaos
(see Fig. 4). The possibility that these trajectories are fractal has been investigated by
several  people  (Osborne,  Brown  and  others).  The  major  problem  with  these  analyses  is 
that  the  data  records  are  short  (typically  about  1000  points),  while  the  methods  used  in 
chaotic  analysis  require  one  or  two  orders  of  magnitude  more  data  for  confident  estimates. 

Figure 2. The Vallis (1986) ENSO model. Top: west-east section of the equatorial Pacific Ocean,
defining symbols used in the model. Center: model equations. Bottom: the chaotic attractor
resulting from the model equations with parameters $A = 1$ year$^{-1}$ and $B = 2$ m$^2$ s$^{-2}$ °C$^{-1}$.

Figure 3. The Lyapunov spectrum of the Vallis attractor. The panels show the convergence of a
numerical estimate of the respective Lyapunov exponents as a function of time. The noted
asymptotic value is the final estimate of the exponent. The time units are nondimensional and
correspond to one unit being equivalent to one week. The Lyapunov dimension (calculated using
the Kaplan-Yorke equation (28)) of this system is 2.087.

Early calculations by Osborne et al. (1986) for a year of measurements of three surface
drifters  indicated  a  correlation  dimension  of  about  14.  More  recent  calculations  on 
SOFAR  float  trajectories  (Brown  and  Smith,  1990)  are  more  ambiguous.  Based  on 
available  observations,  the  current  conclusion  is  that  float  trajectories  are  probably  not 
chaotic.  They  are  more  likely  to  be  controlled  by  turbulent  processes. 

1.3  Limits  to  predictability 

If  a  system  is  chaotic,  then  trajectories  that  are  nearby  in  phase  space  will  diverge 
exponentially.  Increasing  the  accuracy  of  the  observations  does  not  help,  since 
predictability  only  increases  linearly  with  the  number  of  digits. 
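
The linear dependence on the number of digits can be made explicit with a short estimate. As a sketch, assume (these symbols are introduced here for illustration only) an initial observational error $\delta_0$, a largest Lyapunov exponent $\lambda > 0$, and a tolerance $\Delta$ beyond which a forecast is considered useless:

```latex
\delta(t) \approx \delta_0\, e^{\lambda t}
\quad\Longrightarrow\quad
T \approx \frac{1}{\lambda} \ln\frac{\Delta}{\delta_0},
\qquad
\delta_0 = 10^{-d}\,\Delta
\;\Longrightarrow\;
T \approx \frac{d \ln 10}{\lambda}.
```

Under these assumptions each additional decimal digit of accuracy extends the predictability time $T$ by the fixed increment $\ln 10/\lambda$, so an exponential improvement in measurement accuracy buys only a linear gain in forecast horizon.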

Another situation that can impose limits on predictability is the possibility that the
boundary between the states of the ocean/atmosphere is fractal. As an illustration of this
possibility, consider the determination of the basins of attraction (i.e., the root that is
reached for a given starting point) for the problem of finding the roots of

$$z^3 - 1 = 0$$

for complex $z$, by using Newton's method. Here Newton's method for this complex
polynomial is the “physics” for a system which ultimately reaches one of three states. It
turns out that the boundaries of the regions that reach a given root are fractal and have
the remarkable property that any boundary point is a boundary between all three domains;
these boundary points define a set known as a Julia set (see Fig. 5). The implication for
predictability is that for measurements with a given finite error, there are some regions that
are perfectly predictable and other regions where there is no predictability at all.
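
The basin calculation behind this illustration is easy to sketch. The code below is a minimal example, not the procedure used to draw Fig. 5: the function name, iteration limit, tolerance, and test points are assumptions made here. It applies Newton's iteration $z \leftarrow z - (z^3 - 1)/(3z^2)$ to a starting point and reports which of the three cube roots of unity is reached.

```python
ROOTS = [
    complex(1.0, 0.0),
    complex(-0.5, 3.0 ** 0.5 / 2.0),
    complex(-0.5, -(3.0 ** 0.5) / 2.0),
]

def newton_basin(z, max_iter=100, tol=1e-10):
    """Return the index in ROOTS of the root reached from z, or None if undecided."""
    for _ in range(max_iter):
        if z == 0:
            return None                      # derivative vanishes; iteration undefined
        z = z - (z ** 3 - 1.0) / (3.0 * z ** 2)
        for index, root in enumerate(ROOTS):
            if abs(z - root) < tol:
                return index
    return None

# Classify a few starting points; a fine grid of such labels gives a basin picture.
labels = [newton_basin(z) for z in (2 + 0j, complex(-1, 1), complex(-1, -1))]
```

Starting points well inside a basin are classified robustly; near the basin boundaries, arbitrarily small changes in the starting point can change the root that is reached, which is the loss of predictability described above.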

1.4 Perturbation expansion of chaotic models

One  common  technique  in  solving  nonlinear  systems  is  to  do  a  perturbation  expansion 
about  some  small  parameter.  We  demonstrate  here  that  a  conventional  perturbation 
expansion  may  not  be  helpful  when  the  system  is  chaotic  because  the  perturbation  solution 
has  no  chaotic  behavior. 

Look at the Lorenz system of equations,

$$\dot{x} = \sigma(y - x), \tag{1}$$
$$\dot{y} = -y - xz + rx, \tag{2}$$
$$\dot{z} = xy - bz. \tag{3}$$

Figure 5. The basins of attraction for the roots of $z^3 - 1 = 0$, for complex $z$, using Newton's
method. The starting points that converge to the root $z = 1$ are colored grey, points that converge to the
root $z = (-1 + i\sqrt{3})/2$, and points that converge to the root $z = (-1 - i\sqrt{3})/2$ are black. (The center of
the figure is at the origin.)


The parameter $r$ is the ratio of the Rayleigh number to the critical Rayleigh
number. The parameter $\sigma$ is the Prandtl number. The third parameter $b$ is related to the
horizontal wave number of the system. Typical values are $r = 28$, $\sigma = 10$, $b = 8/3$, for which
the dimension is 2.05. A common second set of values is $r = 45.92$, $\sigma = 16$, $b = 4$, for which
the dimension is 2.067.
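
The sensitivity of the Lorenz system at the first parameter set can be seen directly by integrating two nearly identical initial states. This is a minimal sketch under assumptions made here (the initial state, the offset `delta0`, the step size, and the Runge-Kutta integrator are illustrative choices, not taken from the text):

```python
import math

def lorenz(state, sigma=10.0, r=28.0, b=8.0 / 3.0):
    x, y, z = state
    return (sigma * (y - x), -y - x * z + r * x, x * y - b * z)

def rk4_step(f, state, dt):
    """One classical fourth-order Runge-Kutta step."""
    k1 = f(state)
    k2 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)))
    k3 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)))
    k4 = f(tuple(s + dt * k for s, k in zip(state, k3)))
    return tuple(s + dt * (a + 2.0 * b + 2.0 * c + d) / 6.0
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def separation_history(delta0=1e-8, dt=0.01, n_steps=4000):
    """Distance between two trajectories that start delta0 apart in x."""
    first = (1.0, 1.0, 20.0)
    second = (1.0 + delta0, 1.0, 20.0)
    history = []
    for _ in range(n_steps):
        first = rk4_step(lorenz, first, dt)
        second = rk4_step(lorenz, second, dt)
        history.append(math.sqrt(sum((p - q) ** 2 for p, q in zip(first, second))))
    return history

seps = separation_history()
```

The separation starts near `delta0` and grows, on average exponentially, until it saturates at the size of the attractor, after which the two trajectories are effectively unrelated.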

The  interesting  cases  are  where  the  Rayleigh  number  ratio  r  is  large,  which  suggests  that 
we  could  expand  the  system  of  equations  around  a  parameter  proportional  to  the 
reciprocal  of  r  (which  would  be  small). 
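
If we define

$$\epsilon = r^{-1/2} \tag{4}$$

and let

$$x' = \epsilon x, \qquad y' = \epsilon^2 \sigma y, \qquad z' = \sigma(\epsilon^2 z - 1), \tag{5}$$

together with the rescaled time $\tau = t/\epsilon$ (dots below denote $d/d\tau$), then equations (1)-(3)
become (after dropping the primes)

$$\dot{x} = y - \epsilon\sigma x, \tag{6}$$
$$\dot{y} = -xz - \epsilon y, \tag{7}$$
$$\dot{z} = xy - \epsilon b(z + \sigma). \tag{8}$$

Now consider the expansion of $x$, $y$, and $z$ in terms of the parameter $\epsilon$,

$$x = x_0 + \epsilon x_1 + \epsilon^2 x_2 + \cdots, \qquad y = y_0 + \epsilon y_1 + \epsilon^2 y_2 + \cdots, \qquad z = z_0 + \epsilon z_1 + \epsilon^2 z_2 + \cdots. \tag{9}$$

Introducing (9) in (6)-(8) gives the order 0 equations,

$$\dot{x}_0 = y_0, \tag{10}$$
$$\dot{y}_0 = -x_0 z_0, \tag{11}$$
$$\dot{z}_0 = x_0 y_0, \tag{12}$$

and at order $\epsilon$, the system

$$\dot{x}_1 = y_1 - \sigma x_0, \tag{13}$$
$$\dot{y}_1 = -x_0 z_1 - x_1 z_0 - y_0, \tag{14}$$
$$\dot{z}_1 = x_0 y_1 + x_1 y_0 - b(z_0 + \sigma). \tag{15}$$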


The interdependence of the order 0 equations can be removed with some algebraic
manipulation. Use (11) and (12) to eliminate $x_0$:

$$y_0\dot{y}_0 + z_0\dot{z}_0 = 0. \tag{16}$$

Integrating this,

$$y_0^2 = -z_0^2 + \left[ y_0^2(0) + z_0^2(0) \right], \tag{17}$$

where the terms in the brackets of equation (17) are the initial values of $y_0^2$ and $z_0^2$. We will
define this (constant) term as

$$C_{23} = y_0^2(0) + z_0^2(0). \tag{18}$$

If we go back to (10) and (12) to eliminate $y_0$,

$$\dot{z}_0 = x_0\dot{x}_0. \tag{19}$$

Integrating this gives

$$z_0 = \tfrac{1}{2}x_0^2 - \left[ \tfrac{1}{2}x_0^2(0) - z_0(0) \right]; \tag{20}$$

here the terms in the brackets of equation (20) are the initial values of $\tfrac{1}{2}x_0^2$ and $z_0$. We will
define this term as

$$C_{13} = \tfrac{1}{2}x_0^2(0) - z_0(0). \tag{21}$$

Using (17) in (10) gives

$$(\dot{x}_0)^2 = C_{23} - z_0^2. \tag{22}$$

Now using (20),

$$(\dot{x}_0)^2 = -\tfrac{1}{4}x_0^4 + C_{13}x_0^2 + \left[ C_{23} - C_{13}^2 \right]. \tag{23}$$

Given the solution $x_0$ to this equation, $y_0$ can be solved for by using (10). Then given $x_0$
and $y_0$, $z_0$ can be solved for by using (12). An equation for $z_0$ can also be derived by
using manipulations similar to those used in deriving (23) (using equations (17) and (20) in
(12) to eliminate $x_0$ and $y_0$), to give

$$(\dot{z}_0)^2 = -2z_0^3 - 2C_{13}z_0^2 + 2C_{23}z_0 + 2C_{13}C_{23}. \tag{24}$$

Equation (23) can be solved analytically; its solution is a Jacobi elliptic function,

$$x_0 = A\,\mathrm{sn}(\tau|m). \tag{25}$$

The other components can also be determined:

$$y_0 = A\,\mathrm{cn}(\tau|m)\,\mathrm{dn}(\tau|m), \qquad z_0 = \mathrm{dn}^2(\tau|m) + m\,\mathrm{cn}^2(\tau|m) \tag{26}$$

(where $m = -A^2/4$), so the system is well behaved (not chaotic). The first order equations
(13)-(15) are linear so they cannot possibly lead to chaotic solutions. Thus we have
shown that while the actual system can be chaotic, the perturbation solutions may not be.
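
The regular character of the order 0 system can be checked numerically through its two constants of the motion, $C_{23} = y_0^2 + z_0^2$ and $C_{13} = \tfrac{1}{2}x_0^2 - z_0$. The following is a minimal sketch; the initial state, step size, and integrator are assumptions chosen here for illustration.

```python
def order0(state):
    # Order 0 equations: dx0/dt = y0, dy0/dt = -x0*z0, dz0/dt = x0*y0.
    x, y, z = state
    return (y, -x * z, x * y)

def rk4_step(f, state, dt):
    """One classical fourth-order Runge-Kutta step."""
    k1 = f(state)
    k2 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)))
    k3 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)))
    k4 = f(tuple(s + dt * k for s, k in zip(state, k3)))
    return tuple(s + dt * (a + 2.0 * b + 2.0 * c + d) / 6.0
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def invariants(state):
    x, y, z = state
    return (y * y + z * z, 0.5 * x * x - z)   # (C23, C13)

state = (0.5, 1.0, 0.2)                        # hypothetical initial condition
c23_start, c13_start = invariants(state)
max_drift = 0.0
for _ in range(5000):                          # 50 nondimensional time units
    state = rk4_step(order0, state, 0.01)
    c23, c13 = invariants(state)
    max_drift = max(max_drift, abs(c23 - c23_start), abs(c13 - c13_start))
```

Both quantities stay constant to within the integration error, so the order 0 trajectory is confined to the intersection of a cylinder and a parabolic sheet in $(x_0, y_0, z_0)$ space, which leaves no room for chaotic motion.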

Figure  6  shows  a  phase  portrait  of  the  solution  of  the  full  system  (6)  -  (8),  the  zero  order 
system  (10)  -  (12),  and  the  perturbation  solution  to  first  order  (i.e.,  with  e  times  the 
solution  of  (13)  -  (15)  added  to  the  zero  order  solution).  The  perturbation  solution  tracks 
the  nonlinear  solution  for  a  short  while  then  it  moves  off  in  a  different  direction.  The 
perturbation  solution  also  rapidly  grows  to  order  one,  so  that  the  expansion  (9)  is  valid  for 
only  a  limited  time. 


Figure 6. Solutions to the Lorenz equations for large Rayleigh number ratio (equations (6)-(8)). The solid
line  is  the  solution  to  the  nonlinear  (chaotic)  system.  The  long  dashed  line  is  the  solution  to  the  zero  order 
perturbation  experiment.  The  short  dashed  line  is  the  perturbation  solution  to  first  order. 

2 Practical problems in estimating chaotic parameters from actual data

Most methods developed for quantifying chaos (e.g., the Grassberger-Procaccia (1983)
method) require very long data sets in order to converge with a reasonable uncertainty.
Such lengthy data sets do not exist in oceanography, so methods that work with short data
sets (see, for example, Ellner (1988), Havstad and Ehlers (1989) or Abraham et al., 1986)
must be used. Also, the presence of noise (due either to measurement errors or to small
scale oceanic processes) complicates the calculations. In addition, the ill-considered use of
filters applied to the data can make things worse, not better.
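
To make the data requirement concrete, the quantity at the heart of correlation-dimension estimates can be sketched as follows. This is a bare-bones illustration of a correlation sum and its log-log slope, not a full implementation of the Grassberger-Procaccia procedure; the function names, the test set (points on a circle), and the range of radii are assumptions made here.

```python
import numpy as np

def correlation_sum(points, radii):
    """Fraction of distinct point pairs separated by less than each radius."""
    pts = np.asarray(points, dtype=float)
    if pts.ndim == 1:
        pts = pts[:, None]
    diff = pts[:, None, :] - pts[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    pair_dist = dist[np.triu_indices(len(pts), k=1)]   # each pair counted once
    return np.array([(pair_dist < r).mean() for r in radii])

def correlation_dimension(points, radii):
    """Slope of log C(r) against log r over the supplied radii."""
    c = correlation_sum(points, radii)
    slope, _ = np.polyfit(np.log(radii), np.log(c), 1)
    return slope

# Hypothetical test set: 600 points on a unit circle (a one-dimensional curve).
theta = np.linspace(0.0, 2.0 * np.pi, 600, endpoint=False)
circle = np.column_stack([np.cos(theta), np.sin(theta)])
radii = np.logspace(-1.5, -0.5, 8)
d2 = correlation_dimension(circle, radii)
```

For this curve the slope comes out close to 1. For an attractor of dimension $D$ the number of pairs within a small radius scales like $r^D$, so the number of points needed to populate the small-radius range grows rapidly with $D$, which is the practical obstacle described above.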

2.1 The effect of noise

Random errors in the observations of a system can complicate the estimation of the fractal
dimension of a system. Such errors have the effect of increasing the apparent dimension of the
system. This is unfortunate since estimation methods have data requirements that grow
exponentially with the dimension of the system.

In addition, while truly random processes ought to be infinite dimensional, biases in
commonly used dimension algorithms indicate finite dimension when presented with
random  data. 

For  colored  noise,  the  correlations  between  nearby  points  can  produce  effects  that  mimic 
a  finite  correlation  dimension  (Theiler,  1991).  Osborne  and  Provenzale  (1989)  provide  an 
example  of  this  effect.  Kennel  and  Isabelle  (1992)  have  investigated  the  possibility  of 
distinguishing  colored  noise  effects  from  chaos. 

2.2  The  effect  of  filtering  the  observations 

One traditional way to deal with noise in the observations is to apply a filter in an attempt to remove the frequencies that are attributed to the noise. With chaotic systems, the filter can increase the apparent dimension of the system (Badii et al., 1988).

Consider a physical system $\dot{u}(t) = -F(u)$ and an ideal lowpass filter, which can be described as a differential equation that is added to the original system:

$$\dot{z}(t) = -\eta z(t) + x(t) \qquad (27)$$

where $x(t)$ is the filter input, $z(t)$ is the filter output, and $\eta$ is the filter cutoff frequency.

With  this  filter  present,  the  Lyapunov  exponents  of  the  system  consist  of  the  original 
Lyapunov  exponents  plus  a  new  one  resulting  from  the  filter. 

From the Kaplan-Yorke equation for the Lyapunov dimension

$$D_L = j + \frac{\sum_{i=1}^{j} \lambda_i}{|\lambda_{j+1}|} \qquad (28)$$

(with the exponents ordered $\lambda_1 \ge \lambda_2 \ge \dots$ and $j$ the largest index for which $\sum_{i=1}^{j} \lambda_i \ge 0$), the dimension $D_L$ of the system will remain unchanged as long as

$$\eta > |\lambda_{j+1}|. \qquad (29)$$

Otherwise the dimension of the filtered system will increase. In fact, depending upon the size of $\eta$ compared to the other Lyapunov exponents, $D_L$ can increase by as much as 1. There has been some work (e.g., Chennaoui et al., 1990) to remove this effect of filtering on chaotic time series by (at least in a topological sense) unfiltering the time series.
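
The dimension increase caused by a filter can be illustrated numerically. The following is a minimal Python sketch and not part of the original analysis: the function name `kaplan_yorke_dimension` is hypothetical, the Lorenz exponents are commonly quoted approximate values used purely for illustration, and the filter is assumed to add the single exponent $-\eta$ to the spectrum.

```python
def kaplan_yorke_dimension(exponents):
    """Lyapunov (Kaplan-Yorke) dimension of a spectrum of Lyapunov exponents."""
    lam = sorted(exponents, reverse=True)  # largest exponent first
    partial_sum = 0.0
    for j, value in enumerate(lam):
        if partial_sum + value < 0.0:
            # the first j exponents have a non-negative sum; interpolate with the next
            return j + partial_sum / abs(value)
        partial_sum += value
    return float(len(lam))  # the running sum never becomes negative


# Approximate Lorenz exponents (assumed illustrative values)
lorenz = [0.906, 0.0, -14.572]

d_unfiltered = kaplan_yorke_dimension(lorenz)
d_wide_filter = kaplan_yorke_dimension(lorenz + [-20.0])   # eta > |lambda_3|
d_narrow_filter = kaplan_yorke_dimension(lorenz + [-5.0])  # eta < |lambda_3|
print(d_unfiltered, d_wide_filter, d_narrow_filter)
```

With the assumed values the wide filter leaves the dimension unchanged, while the narrow filter raises it by less than 1, in line with the bound stated above.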

3  Methods  from  systems  dynamics 

Even  if  it  turns  out  that  the  ocean  is  not  chaotic,  certain  techniques  developed  for 
analyzing  chaotic  systems  may  prove  useful.  For  many  of  these  methods  the  fact  that  a 
nonlinear  system  is  a  chaotic  one  is  not  essential  for  the  analysis  method  to  be  usable. 

3.1  Mutual  information  and  dynamical  connections 

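The mutual information of two (discrete scalar) messages S and Q is (Fraser and Swinney, 1986)

$$I(Q,S) = H(Q) + H(S) - H(Q,S) \qquad (30)$$

where

$$H(Q) = -\sum_i P_q(q_i) \log P_q(q_i) \qquad (31)$$

(and similarly for S) and

$$H(Q,S) = -\sum_{i,j} P_{qs}(q_i, s_j) \log P_{qs}(q_i, s_j). \qquad (32)$$

Here $P_q$ and $P_{qs}$ denote the marginal and joint probability distributions of the messages. When $Q$ is a set of time delayed measurements $\{q(t+T)\}$ then the first minimum of $I$ as a function of $T$ is a good choice of the lag time in the higher dimensional reconstruction (Fraser, 1986).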
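
As an illustration of the lag selection rule, the sketch below estimates the mutual information from a two-dimensional histogram and searches for its first local minimum. It is a simple stand-in under stated assumptions, not the adaptive partitioning algorithm of Fraser and Swinney: the function names, the fixed number of bins and the synthetic test signal are all hypothetical choices.

```python
import numpy as np


def mutual_information(q, s, bins=16):
    """Histogram estimate of I(Q,S) = H(Q) + H(S) - H(Q,S), in nats."""
    joint, _, _ = np.histogram2d(q, s, bins=bins)
    p_qs = joint / joint.sum()
    p_q = p_qs.sum(axis=1)   # marginal distribution of q
    p_s = p_qs.sum(axis=0)   # marginal distribution of s

    def entropy(p):
        p = p[p > 0]         # empty bins contribute nothing
        return -np.sum(p * np.log(p))

    return entropy(p_q) + entropy(p_s) - entropy(p_qs.ravel())


def first_minimum_lag(x, max_lag, bins=16):
    """Smallest lag T at which I(x(t), x(t+T)) has a local minimum, or None."""
    x = np.asarray(x, dtype=float)
    info = [mutual_information(x[:-lag], x[lag:], bins) for lag in range(1, max_lag + 1)]
    for k in range(1, len(info) - 1):
        if info[k] < info[k - 1] and info[k] < info[k + 1]:
            return k + 1     # info[k] belongs to lag k + 1
    return None


rng = np.random.default_rng(0)
x = rng.normal(size=5000)
noise = rng.normal(size=5000)
i_self = mutual_information(x, x)        # equals the entropy of x
i_indep = mutual_information(x, noise)   # close to zero for independent series

t = np.arange(2000)
signal = np.sin(2 * np.pi * t / 50) + 0.1 * rng.normal(size=t.size)
lag = first_minimum_lag(signal, max_lag=40)
print(i_self, i_indep, lag)
```

The histogram estimator is biased upward for short series, so the value for independent data is small but not exactly zero.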

By taking the appropriate limits, we can calculate an information dimension from the mutual information

$$D_I = D_q + D_s - D_{qs}. \qquad (33)$$

$D_I$ is nonnegative and has the following properties.

$D_I = D_q$ when q = s

$D_I \le D_q$ when q and s are time shifted versions of each other or when they are dynamically related (and have the same dimension)

$D_I = 0$ when q and s are dynamically independent.

Hence we have a test for synchronizability and for dynamical relatedness. This could be exploited to determine whether two different time series (say one from a model and another from actual observations) are controlled by the same dynamics.

3.2  A  theorem  on  dynamic  dependence 

The dimensions and entropies of series can also be used to determine whether two systems are dynamically independent or not. The following theorem is due to Hartt and Kahn (1990).


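Consider a composite system

$$\mathbf{w}_i = (\mathbf{x}_i, \mathbf{y}_i) \qquad (34)$$

where

$$\mathbf{y}_i = [y(t_i), y(t_i + T), \ldots, y(t_i + (d - f - 1)T)]^T \qquad (35)$$

$$\mathbf{x}_i = [x(t_i), x(t_i + T), \ldots, x(t_i + (f - 1)T)]^T \qquad (36)$$

with combined dimension $d = (d - f) + f$. We investigate the effects of the dependence and independence of these subsystems. The supremum norm gives

$$\rho_d(i,j) = \max_k |w_{ik} - w_{jk}| \qquad (37)$$

where k represents a component. It follows that

$$\rho_d(i,j) = \max(\rho_x(i,j), \rho_y(i,j)). \qquad (38)$$

The simplest way to obtain dimensions and entropies is to evaluate the generalized correlation integrals

$$C_q(\epsilon) = \left[\frac{1}{N_{ref}} \sum_{i} \left(\frac{1}{N} \sum_{j} \theta(\epsilon - \rho_d(i,j))\right)^{q-1}\right]^{1/(q-1)} \qquad (39)$$

where $N_{ref}$ = number of reference points and $N$ = number of sample points in the vector time series. Then

$$\theta(\epsilon - \rho_d(i,j)) = \theta(\epsilon - \rho_x(i,j))\,\theta(\epsilon - \rho_y(i,j)). \qquad (40)$$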
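
The factorization of the step function under the supremum norm can be checked directly. The sketch below is illustrative only: the helper names are hypothetical, the two subsystems are independent uniform random scalars rather than delay vectors from a real series, and only the $q = 2$ correlation sum (fraction of close pairs) is evaluated.

```python
import numpy as np


def sup_distances(vectors):
    """Matrix of supremum-norm distances rho(i, j) between all pairs of vectors."""
    v = np.asarray(vectors, dtype=float)
    return np.abs(v[:, None, :] - v[None, :, :]).max(axis=2)


def correlation_sum(rho, eps):
    """Fraction of distinct pairs (i != j) closer than eps (the q = 2 case)."""
    n = rho.shape[0]
    off_diagonal = ~np.eye(n, dtype=bool)
    return np.mean(rho[off_diagonal] < eps)


rng = np.random.default_rng(1)
x = rng.uniform(size=(400, 1))   # subsystem x (hypothetical data)
y = rng.uniform(size=(400, 1))   # independent subsystem y
w = np.hstack([x, y])            # composite vectors

rho_x, rho_y, rho_w = sup_distances(x), sup_distances(y), sup_distances(w)
eps = 0.2

# Under the supremum norm the composite distance is the larger of the two
same_norm = np.array_equal(rho_w, np.maximum(rho_x, rho_y))
# so the step function of the composite system factorizes exactly
factorizes = np.array_equal(rho_w < eps, (rho_x < eps) & (rho_y < eps))

c_x = correlation_sum(rho_x, eps)
c_w = correlation_sum(rho_w, eps)
c_product = c_x * correlation_sum(rho_y, eps)
print(same_norm, factorizes, c_w, c_product)
```

For independent subsystems the composite correlation sum is close to the product of the subsystem sums, which is the property exploited for dynamically independent subsystems.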

There  are  two  important  special  cases: 

•  Identical  subsystems 

•  Independent  subsystems 

3.2.1 Identical subsystems


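In this case $\rho_x = \rho_y$, and $f = d - f = d/2$. Here, $\theta(\epsilon - \rho_x(i,j))\,\theta(\epsilon - \rho_y(i,j)) = \theta(\epsilon - \rho_x(i,j))$, and the correlation integrals of the composite system and of a single subsystem therefore coincide for all $q$ and $\epsilon$. Asymptotically for $\epsilon \to 0$,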


(f )  -  In  r-  exp{-dr  Klid,  z)) 


and  similarly  for  Then 


\nqi,A()=v^lne-dzK:,{d,z) 

=  v,ln^-fTlC(4) 


from  which  we  arrive  at 


3.2.2  Dynamically  independent  subsystems 

Here, $\rho_x(i,j)$ takes values that are independent of $\rho_y(i,j)$. Then $\theta(\epsilon - \rho_y(i,j))$ can be replaced by its average value over the entire series. The cases $q = 1$ and $q = 2$ are especially important. In both of these cases it follows


from  which  asymptotically. 


and  in  the  case  —  =  f  =  d-f, 
2 


Clearly,  C(ji,})  <  C(i). 

4 Summary

•  Several oceanic phenomena, El Niño and drifter trajectories in particular, are suggestive of chaos. For El Niño, the evidence for chaos is inconclusive. Drifter trajectories, on the other hand, are probably not chaotic.

•  Limitations  on  the  quantities  of  data  have  prevented  a  definitive  conclusion  on  the 
existence  of  chaos  in  the  ocean. 

•  The  existence  of  chaos  means  that  special  care  must  be  used  when  dealing  with  both 
the  equations  and  the  data. 

•  The  properties  of  dynamically  connected  chaotic  systems  may  be  useful  in  identifying 
the  dynamical  system. 

ABSTRACT

The  physical  parameters  that  are  important  to  oceanographers  often  have  a  stochastic  nature 
and  can  be  represented  as  the  sum  of  a  deterministic  average  and  a  random  component  of  zero 
mean.  Coastline  shapes,  water  depth  and  fluid  density  are  examples  of  such  quantities.  When 
the  random  components  are  small,  perturbation  methods  can  be  used  to  calculate  their  effects 
on  the  mean  flow.  However,  in  certain  cases  it  is  the  derivative  of  the  random  component  which 
is  of  importance  and  that  can  have  a  very  large  magnitude.  Consequently,  the  ostensibly  small 
stochastic part may well be more influential than the smooth average component. In this paper
we  present  a  technique  for  quantifying  roughness  that  can  be  easily  implemented  for  experimental 
data  sets  and  apply  the  method  to  some  bathymetric  examples.  Moreover,  to  examine  how 
such  randomness  will  influence  ocean  flows  we  consider  the  problem  of  predicting  the  dispersion 
relations  for  topographic  Rossby  waves  propagating  in  the  presence  of  a  rough  ocean  floor.  The 
random  depth  and  its  derivative  act  as  coefficients  in  the  equations  governing  topographic  Rossby 
waves.  In  this  paper  we  analytically  and  numerically  examine  the  solutions  to  those  equations 
and  consider  how  they  change  as  the  roughness  of  the  bottom  increases. 


1.  INTRODUCTION 

Many  physical  characteristics  of  importance  to  the  oceanographer  have  a  stochastic 
nature  and  can  be  represented  as  the  sum  of  a  deterministic  average  and  a  random 
component  of  zero  mean.  Quantities  that  come  to  mind  include  coastline  shapes,  water 
depth  and  fluid  density. 

When the random components are small, perturbation methods can be used to calculate their effects on the mean flow (see, for example, Mysak (1978)). However, in certain cases it is the derivative of the random component which is of importance and that can have a very large magnitude. Consequently, the influence of the ostensibly small stochastic part may well be of the same order as, or even larger than, that of the smooth average component.

For example, it has long been recognized that variations in the sea floor topography allow for the propagation of a class of disturbances known as topographic Rossby waves (see, for example, Pedlosky (1989)). These flows are spatially extensive and have large temporal periods. The critical coefficient in the governing equations for such waves depends on the derivative of the undisturbed water depth. This depth might well be considered random and rather small (O(1) km) when compared to the spatial extent of the waves in question (O(100) km). However, the derivative of the depth which appears in the equation cannot be treated as a small term.

In past studies, the ocean floor was often treated as a plane with a slight slope and indeed mathematical analysis then predicts the existence of topographic waves. While it is true that many regions of the ocean can be characterized by having a small mean slope, it is not apparent that one can ignore other variations in topography in favor of this slope.

A  natural  question  arises  then  as  to  whether  one  can  quantitatively  characterize  the 
roughness  inherent  in  bathymetric  data.  Can  one  give  some  reasonable  estimate  of  the 
“average  slope”  of  such  a  data  set?  In  recent  years,  fractal  techniques  have  proved 
popular  in  similar  quests  by  researchers  in  many  fields.  However,  dimension  estimates 
depend  on  infinite  scaling  properties  that  are  often  not  physically  justifiable  as  real 
surfaces  are  self  similar  only  over  a  limited  range  of  scales.  Moreover,  as  a  practical 
matter,  fractal  analyses  are  simply  not  formulated  with  discrete  data  sets  in  mind. 

In  this  paper  we  take  an  alternative  approach  based  on  the  geometric  thermodynamic 
theory  for  curves  and  surfaces.  The  essential  ideas  are  developed  in  the  next  section 
and  are  illustrated  there  by  a  number  of  simple  examples.  We  merely  note  now  that  the 
theory  allows  one  to  compute  a  temperature  for  a  curve  or  surface.  The  temperature 
of  a  straight  line  is  zero  and  the  value  increases  as  the  curve  roughness  increases.  The 
quantity  can  be  successfully  measured  even  for  data  sets  of  finite  resolution  and  efficient 
algorithms  to  accomplish  this  are  presented  in  section  3. 

Tools  to  analyze  rough  data  sets  are  very  useful  but  it  is  even  more  intriguing  to 
apply  those  tools  to  predict  how  roughness  influences  oceanographic  flows.  To  that 
end  we  take  up  the  problem  of  computing  the  dispersion  relations  that  govern  linear 
topographic  Rossby  wave  disturbances  for  an  ocean  of  random  depth.  The  governing 
equations,  some  properties  of  their  solutions,  and  the  numerical  methods  used  to  solve 
them  are  described  in  section  4. 

To render the problem computationally tractable, we restrict attention to topographies that vary in one direction only, i.e. we consider oceans with corrugated floors. Past investigations by Thomson (1975), Odulo and Pelinovsky (1978) and others have shown that if the waves are constrained to propagate in the same direction as the bottom relief then even for simple floors with periodic ripples wave dissipation and reflection are observed. In particular, the second study showed that the characteristic damping time for Rossby waves is $\sim (\Delta h / h_0)$. Typical values for the ocean are fairly large, in the 4 months to 3 years range. In this paper we will consider waves propagating in a direction that is not parallel to the bottom topography and in this case the waves do not dissipate.

It should be pointed out from the start that while the ocean floor can be modelled stochastically, it is far removed from a white noise state. In this paper most of the simulations were done for synthetic bottom profiles though some preliminary analysis has been carried out for data sets collected in the North Atlantic and Pacific. Some of the methods we used to synthesize rough bottom profiles are presented in section 5 and their geometric thermodynamic characteristics are computed.

In section 6 we present results for bathymetries of various temperatures.

2  GEOMETRIC  THERMODYNAMICS 

In  this  section  we  explain  how  one  can  formulate  a  thermodynamic  theory  for 
geometric  objects  and  how  one  can  use  that  theory  to  construct  quantitative  estimates  of 
the  “roughness”  of  curves.  We  illustrate  the  concepts  with  a  number  of  simple  examples. 


Figure 1 Random straight line $\omega$ intersecting curve $\Gamma$ at three points. The convex hull of $\Gamma$ is the set of points that lie inside the dashed boundary line.

The fundamental quantity we measure for a curve is the average number of intersections it has with randomly chosen straight lines. In general, the rougher the curve, the larger this number will be. To formalize the idea let $\Gamma$ be a rectifiable curve in the plane and let $\Omega(\Gamma)$ be the set of all straight lines intersecting $\Gamma$. Directly measuring the number of intersections between a random element $\omega \in \Omega(\Gamma)$ and $\Gamma$ is a computationally intensive task but Blaschke (1936) has shown that if one picks $\omega$ at random, with the natural (and as it turns out unique) distribution $m$ that is invariant under rigid motions of the plane, then the average number of intersections between the line $\omega$ and the curve $\Gamma$ is given by

$$\frac{2|\Gamma|}{|\partial K|}, \qquad (1)$$

where the convex hull of $\Gamma$, $K$, has boundary $\partial K$ and we use $|\cdot|$ to denote the length of a curve. A detailed definition of the term convex hull is given in the next section but its intuitive meaning should be clear from figure (1).
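
Blaschke's formula can be checked by brute force for simple polygonal curves. The Monte Carlo sketch below is an illustration under stated assumptions, not a method from the paper: lines are written as $x\cos\theta + y\sin\theta = p$ and sampled uniformly in $(p, \theta)$, which is the measure invariant under rigid motions, and only lines that actually hit the curve are kept. The function name, sample size and test curves are hypothetical.

```python
import numpy as np


def mean_intersections(points, n_lines=200_000, seed=0):
    """Average number of crossings of random lines with a polygonal curve."""
    pts = np.asarray(points, dtype=float)
    rng = np.random.default_rng(seed)
    center = pts.mean(axis=0)
    radius = np.max(np.linalg.norm(pts - center, axis=1))
    # Invariant measure on lines: uniform in the normal angle and signed offset
    theta = rng.uniform(0.0, np.pi, n_lines)
    p = rng.uniform(-radius, radius, n_lines)
    normal = np.stack([np.cos(theta), np.sin(theta)])     # shape (2, n_lines)
    signed = (pts - center) @ normal - p                  # vertex side of each line
    # A segment is crossed when its two end points lie on opposite sides
    crossings = np.sum(signed[:-1] * signed[1:] < 0, axis=0)
    hit = crossings > 0                                   # lines that meet the curve
    return crossings[hit].mean()


# Closed unit square: |curve| = |hull boundary| = 4, so the average is 2
square_mean = mean_intersections([(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)])

# Open zigzag: |curve| = 4*sqrt(2), hull perimeter = 6 + 2*sqrt(2)
zigzag_mean = mean_intersections([(0, 0), (1, 1), (2, 0), (3, 1), (4, 0)])
zigzag_exact = 2 * 4 * np.sqrt(2) / (6 + 2 * np.sqrt(2))
print(square_mean, zigzag_mean, zigzag_exact)
```

The sampled average for the zigzag agrees with $2|\Gamma|/|\partial K|$ to within the Monte Carlo error, which shrinks as the number of lines grows.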

An easy derivation of Blaschke's formula for smooth (piecewise differentiable) curves can be found in Santalo (1976) and depends on the formula

$$\int n_\Gamma(\omega)\, d\omega = \int_0^{|\Gamma|} ds \int_0^{\pi} |\sin\theta|\, d\theta = 2|\Gamma|. \qquad (2)$$

Here the line $\omega$ is parameterized by its perpendicular length, $s$, from the origin and by the angle $\theta$ subtended by the normal to $\omega$ with the x axis. $n_\Gamma(\omega)$ is defined as the number of points at which the line $\omega$ intersects $\Gamma$ ($n_\Gamma(\omega) = 3$ in figure (1)). Some manipulation of this formula quickly gives

$$\frac{1}{m(\Omega(\Gamma))} \int_{\Omega(\Gamma)} n_\Gamma(\omega)\, d\omega = \frac{2|\Gamma|}{|\partial K|} \qquad (3)$$

as claimed at the beginning of the section.

Steinhaus  (1954)  observed  that  while  the  quantity  on  the  right  hand  side  of  (3)  only 
makes  sense  for  rectifiable  curves,  the  left  hand  side,  representing  the  average  number 
of  intersections  with  lines,  makes  sense  for  any  planar  set,  whatever  its  complexity. 
The  set  need  not  be  a  curve  representing  a  single  valued  function  or  even  a  curve  at 
all.  He  then  suggested  that  the  average  be  considered  as  a  measure  for  the  “length”  of 
such  a  set.  This  is  the  starting  point  of  our  paper  and  suggests  a  way  of  measuring 
the  roughness  of  interfaces  that  can  be  much  more  general  than  those  described  by 
functions  of  one  or  two  variables. 

DuPain, Kamae and Mendes-France (1986) extended Steinhaus's approach by applying ideas from the field of statistical mechanics. They considered the family $M^*(\Gamma)$ of all probability measures on $\Omega(\Gamma)$ which gave the same average number of intersections of lines $\omega$ with $\Gamma$ as is given by the isotropic homogeneous measure $m$. For a curve $\Gamma$ which has the property that for any positive integer $k$ there exists a line $\omega$ which intersects $\Gamma$ exactly $k$ times, one can associate a geometric entropy function $\sigma : M^*(\Gamma) \to \mathbb{R}$ by defining

$$\sigma(m) = -\sum_{k=1}^{\infty} m(\omega : |\omega \cap \Gamma| = k) \log m(\omega : |\omega \cap \Gamma| = k) \qquad (4)$$

where $|\omega \cap \Gamma|$ stands for the number of intersections between $\omega$ and $\Gamma$.

By a straightforward application of the method of Lagrange multipliers one can then find a "Gibbs" measure $g \in M^*(\Gamma)$ which maximizes the geometric entropy over $M^*(\Gamma)$. It turns out that

$$g(\omega : |\omega \cap \Gamma| = k) = C e^{-\beta k}, \qquad (5)$$

where

$$\beta = \log \frac{2|\Gamma|}{2|\Gamma| - |\partial K|}. \qquad (6)$$
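
The Lagrange multiplier step can be sketched as follows. This is a standard maximum entropy calculation written out for convenience, with $p_k$ (an auxiliary symbol, not used in the original) denoting the probability that a line meets the curve exactly $k$ times and $\alpha$ an auxiliary multiplier.

```latex
% Maximize -\sum_k p_k \log p_k subject to normalization and a fixed mean
\begin{align*}
&\text{Constraints:}\quad \sum_{k=1}^{\infty} p_k = 1, \qquad
  \sum_{k=1}^{\infty} k\,p_k = \frac{2|\Gamma|}{|\partial K|}. \\
&\text{Stationarity:}\quad -\log p_k - 1 - \alpha - \beta k = 0
  \;\Longrightarrow\; p_k = C e^{-\beta k}. \\
&\text{Normalization:}\quad C \sum_{k=1}^{\infty} e^{-\beta k}
  = \frac{C}{e^{\beta} - 1} = 1
  \;\Longrightarrow\; C = e^{\beta} - 1. \\
&\text{Mean:}\quad \sum_{k=1}^{\infty} k\,p_k = \frac{e^{\beta}}{e^{\beta} - 1}
  = \frac{2|\Gamma|}{|\partial K|}
  \;\Longrightarrow\; e^{\beta} = \frac{2|\Gamma|}{2|\Gamma| - |\partial K|}.
\end{align*}
```

The maximizing measure is therefore a geometric distribution in the number of intersections, and its single parameter is fixed by the ratio of curve length to hull perimeter.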

The maximum geometric entropy is then

$$\sigma(g) = \log \frac{e^{\beta}}{e^{\beta} - 1} + \frac{\beta}{e^{\beta} - 1} \qquad (7)$$

and $C^{-1}$ is the partition function with $C = e^{\beta} - 1$.

Other geometric "thermodynamic" quantities can easily be defined including the geometric temperature $\tau = \beta^{-1}$, the geometric pressure $\Pi = |\partial K|^{-1}$, the geometric volume $V = |\Gamma|$, the geometric heat $Q = (e^{\beta} - 1)$ and the geometric free energy $F = \log(e^{\beta} - 1)$. A particular quantity that we shall make further use of is the geometric internal energy

$$U = \frac{2|\Gamma|}{|\partial K|} = \frac{e^{\beta}}{e^{\beta} - 1}.$$


Although the construction used above is only valid for a limited class of curves, the quantities $\beta$ and $\sigma$ can easily be extended to all rectifiable curves. Mann, Rains and Woyczynski (1991) contains further details of the application of these ideas to the roughness of surfaces but in this paper we will only consider one dimensional objects.

Note that if $\Gamma$ is itself a straight line segment then $U = 1$ and $\tau = 0$. Thus the least interesting curves, straight lines, all have zero temperatures!
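
These relations are easy to evaluate once the curve length and hull perimeter are known. The sketch below is a minimal illustration with a hypothetical function name; it assumes $e^{\beta} = U/(U-1)$, $\tau = 1/\beta$ and the maximum entropy of the geometric distribution, and treats $U \le 1$ as the zero temperature limit.

```python
import math


def geometric_thermodynamics(curve_length, hull_perimeter):
    """Return (U, beta, tau, sigma) for a curve of given length and hull perimeter."""
    u = 2.0 * curve_length / hull_perimeter         # internal energy U
    if u <= 1.0:                                    # straight segment: zero temperature
        return 1.0, math.inf, 0.0, 0.0
    beta = math.log(u / (u - 1.0))                  # from e^beta = U / (U - 1)
    tau = 1.0 / beta                                # geometric temperature
    sigma = beta * u - math.log(math.expm1(beta))   # maximum geometric entropy
    return u, beta, tau, sigma


# A straight segment of length 3 has a degenerate hull of perimeter 6
print(geometric_thermodynamics(3.0, 6.0))

# A slightly rough curve with U = 1.0007
print(geometric_thermodynamics(1.0007, 2.0))
```

Because $\tau$ depends on $\log(U - 1)$, small rounding differences in $U$ near 1 produce visible changes in the temperature, so $U$ should be carried at full precision.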

To gain some familiarity with the concepts outlined above let us consider some other simple examples where $\Gamma$ is a portion of an infinite curve with small scale roughness. It is then reasonable to replace $|\partial K|$ by $2L$ where $L$ is the distance between the end points. This is because for a periodic extension of $\Gamma$ the convex hull is an infinite strip and the section of $\partial K$ corresponding to $\Gamma$ has perimeter approximately equal to $2L$. In a later section we will explain algorithms that can be used to precisely measure the convex hull for more complicated situations. With that approximation $U = |\Gamma|/L$.


Example  1:  Sawtooth  Curves 

Let $\Gamma_N$ be the periodic sawtooth curve with period $a$ and amplitude $\epsilon h_0$ shown on the left of figure (2).


Figure 2 Symmetric and asymmetric sawtooth curves

Another observation that will be of some consequence later is that the profile on the right of figure (2), which is not invariant with respect to a change of direction $x \to -x$, still gives rise to the same values for U and S.

Example  2:  Sinusoidal  Curves 

For our next example we consider the sinusoidal profile

$$h(x) = h_0\left(1 + \epsilon \sin\frac{2\pi}{a}x\right), \qquad 0 < x < Na. \qquad (13)$$

The length of the curve is


Setting  p  =  irSf  2  this  integral  can  be  expressed  in  terms  of  the  elliptic  function  of 
second  kind  E  in  the  form 


so that in the first approximation


and  once  again  the  average  slope  is  proportional  to  —  1 


3.  MEASURING  THE  CONVEX  HULL 


A domain $D \subset \mathbb{R}^n$ is said to be convex if for every pair of points $p_1, p_2 \in D$, the line segment $\overline{p_1 p_2}$ is entirely contained in $D$. Given an arbitrary set of points $S \subset \mathbb{R}^n$, the convex hull conv($S$) of $S$ is defined to be the smallest convex domain containing $S$. The hull of a finite set of points will always be a convex polytope.

For any finite set of points on the plane it is easy to visualize the convex hull by imagining that the points are marked on a board with protruding nails. To find the hull, stretch a rubber band so that it encloses all of the points and release it. The band will be caught on the nails located at the extreme points of the set and form the polygonal boundary of the convex hull.

In order to apply Blaschke's formula (1) to general sets of points we must implement an algorithm for computing the convex hull. Several algorithms for this purpose exist for planar sets of points and we will merely mention a couple of techniques here. The reader is referred to the text by Preparata and Shamos (1985) for further details of the theory.

The  package  wrapping  technique  is  the  simplest  algorithm  for  extracting  the  subset  of 
the  points  that  form  the  convex  hull.  While  it  is  not  the  fastest  method  for  sets  of  points 
on  the  plane,  it  deserves  attention  because  it  is  one  of  the  few  that  can  be  generalized  to 
deal  with  higher  dimensional  data.  This  is  an  important  consideration  because  we  will 
eventually  want  to  handle  large  three  dimensional  sets  of  topographic  data. 

The  method  parallels  how  a  human  might  draw  the  boundary  of  the  convex  hull. 
Start  with  some  point  that  is  guaranteed  to  be  on  that  boundary,  say  the  one  with  smallest 
y  coordinate.  Fix  one  end  of  a  horizontal  line  to  this  point  and  rotate  it  upwards  until 
it  encounters  another  point  in  the  set.  That  point  must  also  belong  to  the  convex  hull. 
Use  it  as  a  new  anchor  for  the  horizontal  line  and  repeat  the  procedure.  Continue  in 
this  fashion  until  you  form  a  package  that  completely  wraps  around  the  original  set  of 
points.  The  package  is  precisely  the  boundary  of  the  convex  hull. 

Of course, instead of sweeping horizontal lines around to see which point in the set they hit first, one actually looks at all the segments between the current anchor and the other points not yet accounted for by the convex hull boundary. The end point of the segment that subtends the smallest angle with the x axis will be the next point in the hull and it will also be the new anchor. The major computational costs associated with the algorithm are the calculation of many angles followed by some form of sorting procedure on those angles. It can be shown that the technique takes $O(N^2)$ operations for sets with $N$ points. The constant in front of the $N^2$ is large however.
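
The package wrapping idea can be sketched in a few lines of Python. This is an illustrative version under stated assumptions rather than the implementation used for the paper: it replaces the explicit angle calculation with a cross product orientation test, which selects the same next point, and the function names are hypothetical.

```python
def cross(o, a, b):
    """Twice the signed area of triangle o-a-b (> 0 for a counterclockwise turn)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def dist2(a, b):
    """Squared distance between two points."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def package_wrap(points):
    """Convex hull vertices in counterclockwise order by package wrapping."""
    pts = sorted(set(map(tuple, points)))
    if len(pts) < 3:
        return pts
    start = min(pts, key=lambda p: (p[1], p[0]))   # the lowest point is on the hull
    hull, anchor = [], start
    while True:
        hull.append(anchor)
        candidate = pts[0] if pts[0] != anchor else pts[1]
        for p in pts:
            if p == anchor:
                continue
            turn = cross(anchor, candidate, p)
            farther = dist2(anchor, p) > dist2(anchor, candidate)
            # p replaces the candidate if it lies clockwise of it,
            # or is collinear with it but farther from the anchor
            if turn < 0 or (turn == 0 and farther):
                candidate = p
        anchor = candidate
        if anchor == start:                        # the package is closed
            return hull


sample = [(0, 0), (4, 0), (4, 4), (0, 4), (1, 1), (2, 3), (2, 0)]
print(package_wrap(sample))
```

Each hull vertex costs one pass over all points, so the work is proportional to $N$ times the number of hull vertices, which reaches $N^2$ when every point lies on the hull.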

Several improvements on the basic algorithm can be made. For one thing, it is possible to cheaply eliminate many of the points before we call the convex hull routine. One way to do this is to construct an extreme quadrilateral by searching for those points in the set that have the largest and smallest x and y coordinates. This search can be done in 2N operations and will typically yield four different vertices. Points that lie inside the region defined by those vertices cannot be on the convex hull boundary. By eliminating them (another linear time process) one effectively reduces N, the number of points submitted to the more expensive package wrapping technique. If one happens to know something about the distribution of the set of points even better quadrilaterals can be chosen to maximize this effect.
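
The interior check might be sketched as follows. This is a hedged illustration using NumPy, with a hypothetical function name; it assumes the extreme quadrilateral is traversed counterclockwise (lowest, rightmost, highest, leftmost point) and removes only points strictly inside it, so points on its edges are kept.

```python
import numpy as np


def drop_interior_points(points):
    """Remove points strictly inside the quadrilateral of extreme points."""
    pts = np.asarray(points, dtype=float)
    # Counterclockwise order: lowest, rightmost, highest, leftmost point
    idx = [pts[:, 1].argmin(), pts[:, 0].argmax(), pts[:, 1].argmax(), pts[:, 0].argmin()]
    quad = pts[idx]
    inside = np.ones(len(pts), dtype=bool)
    edges = 0
    for a, b in zip(quad, np.roll(quad, -1, axis=0)):
        if np.array_equal(a, b):
            continue                       # repeated extreme point, no edge
        edges += 1
        turn = (b[0] - a[0]) * (pts[:, 1] - a[1]) - (b[1] - a[1]) * (pts[:, 0] - a[0])
        inside &= turn > 0                 # strictly to the left of every edge
    if edges < 3:
        return pts                         # degenerate quadrilateral, keep everything
    return pts[~inside]


cloud = [(0, -1), (1, 0), (0, 1), (-1, 0), (0, 0), (0.2, 0.1), (0.6, 0.6)]
kept = drop_interior_points(cloud)
print(kept)
```

In this small example the two points near the origin are discarded, while the point (0.6, 0.6), which lies outside the quadrilateral, is passed on to the hull routine.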

Additional savings come from the realization that a lot of time is spent computing angles. A naive implementation might calculate $\theta = \arctan(\Delta y / \Delta x)$ but evaluating the arctangent is relatively expensive, particularly on RISC machines that do not perform the computation in hardware. In any case, the precise value of the angle is of little interest here as we are only using it as a key in the sorting process. What is required is a cheap alternative to the arctangent that preserves the ordering properties of that function. A good candidate for this purpose is $\Delta y / (|\Delta x| + |\Delta y|)$ with appropriate modifications for positive and negative values of $\Delta x$ and $\Delta y$.
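
One way to complete the quadrant modifications is shown below. This is a sketch under stated assumptions: the function name and the particular mapping of the four quadrants onto the interval [0, 4) are illustrative choices, not necessarily those of the original code.

```python
import math
import random


def pseudo_angle(dx, dy):
    """Monotone surrogate for the angle of (dx, dy), in the range [0, 4)."""
    if dx == 0 and dy == 0:
        return 0.0
    t = dy / (abs(dx) + abs(dy))
    if dx < 0:
        return 2.0 - t      # second and third quadrants
    if dy < 0:
        return 4.0 + t      # fourth quadrant
    return t                # first quadrant


# Check that sorting by the surrogate reproduces the ordering by true angle
random.seed(1)
vectors = [(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(200)]
by_arctangent = sorted(vectors, key=lambda v: math.atan2(v[1], v[0]) % (2 * math.pi))
by_pseudo_angle = sorted(vectors, key=lambda v: pseudo_angle(v[0], v[1]))
same_order = by_arctangent == by_pseudo_angle
print(same_order)
```

The surrogate needs one division and two absolute values per point and no transcendental function, which is the saving described above.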

The two techniques just mentioned, eliminating "obvious" points and replacing an expensive calculation with a cheaper one, can both be used to good effect for data in any dimension. Further savings are possible for planar points. Graham (1972) suggested that one first form any simple closed polygon that contains all the points. Having found this polygon one then proceeds to eliminate from it those points that do not belong to the convex hull. The major cost of this effort is the initial construction of the closed polygon and this is done by a sorting procedure based on angles from, say, the lowest point in the set. The number of operations for the Graham scan is thus dominated by the sorting process which can be done in N log N operations for N input points. Even more sophisticated divide and conquer algorithms are available in the literature but we have found that even for large sets of data the Graham scan technique coupled with interior elimination provides adequate efficiency.
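
A compact version of the Graham scan is sketched below. It is an illustration under stated assumptions, not the authors' code: the angular sort about the lowest point is done with a cross product comparator instead of explicit angles, collinear points are dropped from the hull, and all names are hypothetical.

```python
from functools import cmp_to_key


def cross(o, a, b):
    """Twice the signed area of triangle o-a-b (> 0 for a counterclockwise turn)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def graham_scan(points):
    """Convex hull vertices in counterclockwise order, starting at the lowest point."""
    pts = sorted(set(map(tuple, points)))
    if len(pts) < 3:
        return pts
    pivot = min(pts, key=lambda p: (p[1], p[0]))

    def by_angle(a, b):
        turn = cross(pivot, a, b)
        if turn != 0:
            return -1 if turn > 0 else 1    # a comes first if b is counterclockwise of a
        da = (a[0] - pivot[0]) ** 2 + (a[1] - pivot[1]) ** 2
        db = (b[0] - pivot[0]) ** 2 + (b[1] - pivot[1]) ** 2
        return -1 if da < db else 1         # collinear: nearer point first

    # Sorting about the pivot builds the simple closed polygon
    rest = sorted((p for p in pts if p != pivot), key=cmp_to_key(by_angle))
    hull = [pivot]
    for p in rest:
        # Discard vertices that would make a clockwise or straight turn
        while len(hull) >= 2 and cross(hull[-2], hull[-1], p) <= 0:
            hull.pop()
        hull.append(p)
    return hull


sample = [(0, 0), (4, 0), (4, 4), (0, 4), (1, 1), (2, 3), (2, 0)]
print(graham_scan(sample))
```

After the sort, each point is pushed once and popped at most once, so the scan itself is linear and the N log N sort dominates, as stated above.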

Figure  (3)  shows  the  calculation  of  the  convex  hull  boundary  for  100  points  which 
were  chosen  to  be  uniformly  distributed  in  a  square. 


Figure  3  These  four  plots  show;  (a)  the  original  set  of  points,  (b)  the  set  with  the  “interior”  points 
removed,  (c)  the  package  wrapping  algorithm  in  progress,  (d)  the  completed  convex  hull  boundary. 


Roughness  Calculations  for  Bathymetric  Data 

The topographic wave dispersion relation computations carried out in the later sections of this paper are for synthesized bottom profiles only. Indeed, at this early stage of our investigations we are primarily interested in profiles having a controllable degree of roughness. However, it is naturally interesting to examine the degree of roughness that is present in real bathymetric data. Therefore we have analyzed some of the high quality tracklines that are present in the large database assembled by the National Geophysical and Solar-Terrestrial Data Center/NOAA (NGSDC/NOAA 1977). The reader is referred to Dworski and Holloway (1983) for a statistical study of these data.

We present a sample calculation here. The data came from a cruise by the R/V Melville II from Adak to Tokyo in October 1973. Bathymetry data at the start and the end of the cruise were ignored until a reading of 5000 meters was encountered. Some 2408 depths were recorded corresponding to about one reading every 1.75 kilometers along the track. On the Mercator map the trackline is approximately a straight line starting at (175°W, 52°N) near Adak in the North Pacific and proceeding south west to (143°E, 35°N) west of Tokyo. Figure (4) shows the total depth profile.


Figure 4 Bathymetric data from the cruise of the R/V Melville II from Adak to Tokyo in October 1973.


Looking at figure (4) it is clear that the data are rougher in some sections than in others. In the following table we present some thermodynamic characteristics for the curve as a whole and then separately for four 1000 kilometer sections along its length. We note that the thermodynamic statistics are all perfectly well defined for sections of the curve; in the future we expect to make use of this trait to focus our computational energies on those parts of the boundaries that are likely to provide the greatest challenge for flow simulations. The fact that the theory is well posed for even the crudest of data sets makes it a useful diagnostic for adaptive computations.

Section          Number of     Number eliminated    U = 2|Γ|/|∂K|    Temperature τ
                 data points   by interior check

Full track       2408          1726                 1.0007           0.1378
1000-2000 km     520           457                  1.0017           0.1566
2000-3000 km     544           495                  1.0009           0.1416
3000-4000 km     575           295                  1.0001           0.1033
4000-5000 km     564           260                  1.0004           0.1271

In the table we report on the number of points that were present in the experimental data and also the number of those that were eliminated by the interior check procedure before the convex hull routine was called. On the average, some 70% of the data points were eliminated by this check and in fact the calculations could easily be carried out in near real time on a moderate workstation or personal computer. We note that the temperature of the first 1000 kilometer stretch is the largest which corresponds well with our intuitive sense that data sections that are visually "roughest" should give rise to larger temperatures.
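
The section by section calculation can be sketched as below. This is an illustration with synthetic profiles, not the trackline data: the hull perimeter is computed with Andrew's monotone chain (a stand-in for the Graham scan with interior elimination used in the paper), the temperature is taken as $\tau = 1/\log(U/(U-1))$, and all function names and sample depths are hypothetical.

```python
import math


def hull_perimeter(points):
    """Perimeter of the convex hull of a list of (x, y) tuples."""
    pts = sorted(set(points))
    if len(pts) < 2:
        return 0.0

    def half_hull(seq):
        chain = []
        for p in seq:
            # Pop while the last two chain points and p fail to turn counterclockwise
            while len(chain) >= 2 and (
                (chain[-1][0] - chain[-2][0]) * (p[1] - chain[-2][1])
                - (chain[-1][1] - chain[-2][1]) * (p[0] - chain[-2][0])
            ) <= 0:
                chain.pop()
            chain.append(p)
        return chain[:-1]

    ring = half_hull(pts) + half_hull(reversed(pts))
    return sum(math.dist(ring[i], ring[(i + 1) % len(ring)]) for i in range(len(ring)))


def section_temperature(x, depth):
    """Return (U, tau) for the profile given by positions x and depths."""
    profile = list(zip(x, depth))
    length = sum(math.dist(a, b) for a, b in zip(profile, profile[1:]))
    u = 2.0 * length / hull_perimeter(profile)
    tau = 0.0 if u <= 1.0 else 1.0 / math.log(u / (u - 1.0))
    return u, tau


x = list(range(9))
flat = section_temperature(x, [5.0] * 9)
gentle = section_temperature(x, [0.1 * (i % 2) for i in x])
rough = section_temperature(x, [1.0 * (i % 2) for i in x])
print(flat, gentle, rough)
```

Applied to consecutive windows of a depth record, the same two functions give one temperature per section, which is how a table like the one above could be assembled.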


4.  LINEAR  TOPOGRAPHIC  ROSSBY  WAVES 


In this section we consider perturbations of the rest state for a rotating inviscid ocean of variable depth. The motions to be considered will be characterized as having a large horizontal extent when compared to the maximum water depth and therefore use is made of the hydrostatic approximation. The curvature of the earth is ignored and we will denote by $f$ the local vertical component of the earth's rotation vector, the Coriolis parameter, which we take to be a constant.

The equations linearized about the rest state are (cf. LeBlond and Mysak (1978))

$$u_t - f v = -g \eta_x,$$
$$v_t + f u = -g \eta_y, \qquad (20)$$
$$\eta_t + (hu)_x + (hv)_y = 0,$$

where $u, v$ are the perturbation velocity components in the $x, y$ directions, $h(x,y)$ measures the undisturbed water depth and $\eta(x,y,t)$ measures the departure of the free surface from the rest state.

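These are easily reduced to the following set

$$\partial_t\left[(\partial_{tt} + f^2)\eta - g\nabla\cdot(h\nabla\eta)\right] - g f J(h,\eta) = 0,$$
$$(\partial_{tt} + f^2)u = -g(\partial_{xt} + f\partial_y)\eta, \qquad (21)$$
$$(\partial_{tt} + f^2)v = -g(\partial_{yt} - f\partial_x)\eta,$$

where $J(h,\eta) = h_x\eta_y - h_y\eta_x$.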
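
For completeness, the reduction can be sketched in three steps. This is a routine elimination written out as a reading aid, using only the linearized momentum and continuity equations and a constant Coriolis parameter $f$.

```latex
\begin{align*}
% Step 1: differentiate the x-momentum equation in t and eliminate v_t
u_{tt} - f v_t = -g\eta_{xt}, \quad v_t = -fu - g\eta_y
  &\;\Longrightarrow\; (\partial_{tt} + f^2)\,u = -g(\eta_{xt} + f\eta_y), \\
% Step 2: differentiate the y-momentum equation in t and eliminate u_t
v_{tt} + f u_t = -g\eta_{yt}, \quad u_t = fv - g\eta_x
  &\;\Longrightarrow\; (\partial_{tt} + f^2)\,v = -g(\eta_{yt} - f\eta_x), \\
% Step 3: apply (\partial_{tt} + f^2) to the continuity equation
(\partial_{tt} + f^2)\eta_t - g\,\partial_x\!\left[h(\eta_{xt} + f\eta_y)\right]
  - g\,\partial_y\!\left[h(\eta_{yt} - f\eta_x)\right] &= 0, \\
\partial_t\!\left[(\partial_{tt} + f^2)\eta - g\nabla\cdot(h\nabla\eta)\right]
  - g f\,(h_x\eta_y - h_y\eta_x) &= 0.
\end{align*}
```

The last line uses $(h\eta_y)_x - (h\eta_x)_y = h_x\eta_y - h_y\eta_x$, so the topography enters the wave equation only through its derivatives.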

Even in the case of a flat ocean floor when $h(x,y)$ = constant, the equations admit wave solutions. These gravity waves have relatively short periods which are $< 1/f$ and are not of interest in the current study. It is convenient to eliminate them from the start and to concentrate on the longer period waves that are only seen in the presence of a non-trivial topography. A scaling analysis shows that for motions with long temporal periods (typically 50 or more days) the $\partial_{tt}$ terms in (21) are negligible. Moreover, for the periodic boundary data under consideration, the equations for $u$ and $v$ decouple entirely from the $\eta$ equation allowing us to concentrate on

$$\partial_t\left[f^2\eta - g\nabla\cdot(h\nabla\eta)\right] - g f J(h,\eta) = 0. \qquad (22)$$

If  L  is  a  characteristic  horizontal  length  and  D  is  say  the  maximum  undisturbed 
ocean  depth  we  can  introduce  non-dimensional  (starred)  variables  as  follows 


X  =  y  =  ^y\  h  =  Dh\  r,  =  Dy" , 

t  =  u  =  V  =  ^v". 


2w 


2ir 


and arrive at the following equation for η*


(23) 


dr  [pW  -  •  (h*V*g*)]  -  J*(h\g*)  =  0.  (24) 


All  derivatives  are  now  taken  with  respect  to  the  non-dimensional  variables  while 


PT  = 


y/L/2wg  D 

/-»  ’ 


(25) 
are small non-dimensional time and length ratio parameters respectively. From now
on we shall drop the stars and all references will be to the non-dimensional equations
and variables.

It is our intent to solve (24) for random, periodic h(x, y). Note that it is the
derivatives of this random function that are important in the current context. To render
the problem computationally tractable we will only consider the case where h = h(y).
A normal mode decomposition of the following form is then employed

=  (26) 


yielding  the  equation 


a  -I-  Q^pLh)ii  -  piihr}')'^  -I-  ah'rj  =  0.  (27) 

where  the  prime  denotes  the  derivative  with  respect  to  y. 

Properties  of  the  Governing  Equation 

Introducing  new  parameters 

$$
\lambda = -\frac{\alpha}{\rho_L\,\sigma}, \qquad \mu = \frac{\rho_T}{\rho_L}, \tag{28}
$$


equation (27) becomes

$$
\left(h(y)\,\eta'\right)' + \left[\,i\lambda\,h'(y) - \left(\alpha^2 h(y) + \mu\right)\right]\eta = 0, \tag{29}
$$

which is to be solved with periodic boundary data. In this format the equation is
similar to a periodic Sturm-Liouville system (see for example Birkhoff and Rota (1978))
except for the important fact that h'(y), the coefficient multiplying the eigenvalue, is not
necessarily positive. Nevertheless, many of the results for Sturm-Liouville systems still
apply. For example we can easily prove the following orthogonality theorem.


Theorem:

Eigenfunctions corresponding to different eigenvalues are orthogonal with respect to dh,
i.e. if η^(1) and η^(2) are eigenfunctions belonging to distinct eigenvalues λ^(1) and λ^(2) then

$$
\int_0^{2\pi} \eta^{(1)}(y)\,\eta^{(2)}(y)\,h'(y)\,dy = 0. \tag{30}
$$
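Proof: Define the operator L by

$$
L[\eta] = \frac{d}{dy}\left(h(y)\,\frac{d\eta}{dy}\right) - \left(\alpha^2 h(y) + \mu\right)\eta. \tag{31}
$$

Then

$$
L[\eta^{(k)}] = -i\lambda^{(k)}\,h'(y)\,\eta^{(k)} \quad \text{for } k = 1, 2. \tag{32}
$$

It is easily verified directly that

$$
\eta^{(1)} L[\eta^{(2)}] - \eta^{(2)} L[\eta^{(1)}]
  = \frac{d}{dy}\left\{ h(y)\left[\eta^{(1)}\,\frac{d\eta^{(2)}}{dy} - \eta^{(2)}\,\frac{d\eta^{(1)}}{dy}\right]\right\}. \tag{33}
$$

Integrating from 0 to 2π and substituting for the operators on the left hand side yields

$$
i\left(\lambda^{(1)} - \lambda^{(2)}\right)\int_0^{2\pi} \eta^{(1)}\,\eta^{(2)}\,h'(y)\,dy
  = \left[\, h(y)\left(\eta^{(1)}\,\frac{d\eta^{(2)}}{dy} - \eta^{(2)}\,\frac{d\eta^{(1)}}{dy}\right)\right]_0^{2\pi}, \tag{34}
$$

which is zero due to the periodic nature of the coefficients and the eigenfunctions. Hence
if the eigenvalues are distinct we have the orthogonality result.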

Other properties of interest include

• Eigenpairs come in conjugates, i.e. if (λ, η(y)) is an eigenpair so also is (λ*, η*(y)).

• The eigenvalues λ and thus the wave speeds σ are purely imaginary. Thus the
waves are not dissipative.

• The number of zeros in the eigenfunctions increases as σ decreases.

The latter result has both physical and computational significance. As was mentioned
earlier, the topographic waves of greatest importance are those with the largest
wavelengths. In the x-direction this concern with wavelength causes us to pay particular
attention to small values of the wavenumber parameter α. By the same token we wish to
characterize the eigenfunctions η(y) according to the number of oscillations they make
in the non-dimensional interval y ∈ [0, 2π) and concentrate on those that have the fewest
oscillations, and thus the fewest zeros in that domain. This idea is depicted in figure (5).
Figure 5 “Long” and “short” wavelength solutions.

This ability to label the eigenfunctions is crucial in the computational setting where
the bottom profile is modeled by a random process. Each different realization of the
bottom topography yields a different spectrum of σ’s. It is only by labelling the σ’s by
the number of zeros in matching η’s that we can do any sort of reasonable statistical
analysis on the dispersion relations. Essentially it allows us to compare like with like
from run to run.
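
The labelling step can be illustrated with a small sketch. It assumes each eigenfunction is available as real-valued samples on a uniform grid over one period; the function names `count_zeros_periodic` and `label_by_zeros` are hypothetical and do not come from the paper.

```python
import numpy as np


def count_zeros_periodic(samples):
    """Count sign changes of a real, periodically sampled function.

    Exact zeros in the samples are dropped before comparing neighbours,
    and the last sample is compared with the first (periodic wrap-around).
    """
    signs = np.sign(np.asarray(samples, dtype=float))
    signs = signs[signs != 0]
    if signs.size == 0:
        return 0
    # np.roll pairs every sample with its predecessor, including the wrap.
    return int(np.count_nonzero(signs != np.roll(signs, 1)))


def label_by_zeros(eigenpairs):
    """Group wave speeds by the zero count of their eigenfunctions.

    `eigenpairs` is an iterable of (sigma, eta_samples) tuples; the result
    maps a zero count to the list of sigma values carrying that label.
    """
    labelled = {}
    for sigma, eta_samples in eigenpairs:
        labelled.setdefault(count_zeros_periodic(eta_samples), []).append(sigma)
    return labelled


# Illustration with two synthetic "eigenfunctions" on y in [0, 2*pi).
y = np.linspace(0.0, 2.0 * np.pi, 256, endpoint=False)
long_wave = np.cos(y + 0.3)         # two zeros per period
short_wave = np.cos(3.0 * y + 0.3)  # six zeros per period
labels = label_by_zeros([(0.2j, long_wave), (0.05j, short_wave)])
```

With such labels, the σ attached to a given zero count can be collected across realizations before any statistics are taken.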

The details of the proofs of these and other properties of a mathematical nature
will be published later. We note that in particular we can deduce some asymptotic
results for the Lyapunov exponent and rotation number associated with this equation
when the bottom profile h(y) is a piecewise linear curve such that the slope h'(y) is
a “telegraphic” random process; formally this is a stationary ergodic Markov process
where the slope switches between two states (+H, −H) at nodes that are exponentially
distributed along the y axis.

Prior  Results 

Most  of  the  previous  work  on  this  problem  was  done  for  deterministic  bottom  profiles. 
In  particular  the  two  profiles  shown  in  the  figure  below  were  investigated  by  a  number 
of  researchers  and  we  mention  a  couple  of  relevant  results  from  those  studies  now. 


[Figure: the two deterministic bottom profiles considered, a constant slope profile and a sinusoidal profile.]

In the case of a small constant slope profile

$$
h(y) = 1 - \epsilon y \tag{35}
$$
the following quantized set of dispersion relations is easily found (see for example
Pedlosky)

$$
\sigma_n(\alpha) = i\,\frac{\epsilon}{\rho_L}\left[\frac{\alpha}{n^2 + \alpha^2 + \mu}\right]. \tag{36}
$$

The corresponding topographic waves propagate along the positive x-direction.

For the second case in the figure, that of a small purely sinusoidal bottom profile
with period 2π/n

$$
h(y) = 1 - \epsilon \sin ny \tag{37}
$$

the governing equation is of Hill's type. For small α we get periodic solutions
(periods 2π/n). Rhines and Bretherton (1973) found that asymptotically (μ = 0)

<rn(a)  =  ±i— — ==^==  (38) 

V 

Topographic waves propagate in both directions along the corrugations of the bottom
profile. This is not surprising as the bottom slope varies periodically from positive to
negative. This result can serve as a useful test of the numerics.

Numerical  Simulations 

Next we turn to numerical simulations carried out for other bottom profiles. Having
expressed everything in terms of nondimensional coordinates, we assume that the
bottom topography is a periodic extension of the fundamental interval y ∈ [0, 2π) and
use the Fourier series expansions

$$
\eta(y) = \sum_{k=-\infty}^{\infty} \eta_k\,e^{iky}, \qquad
h(y) = \sum_{k=-\infty}^{\infty} h_k\,e^{iky} \tag{39}
$$

to reduce the differential eigen-problem (27) to the generalized algebraic eigen-problem

(40) 

where

$$
A_{jk} = (\alpha^2 + jk)\,h_{j-k} + \mu\,\delta_{jk}, \qquad
B_{jk} = (j - k)\,h_{j-k}. \tag{41}
$$

It is convenient to introduce the bottom profile by the relation

$$
h(y) = 1 - \epsilon\,b(y). \tag{42}
$$

In terms of the Fourier coefficients of this profile we have

$$
A_{jk} = (\alpha^2 + jk)\,b_{j-k} - \epsilon^{-1}(\alpha^2 + jk + \mu)\,\delta_{jk}, \qquad
B_{jk} = (j - k)\,b_{j-k}. \tag{43}
$$

The eigenvalues produced from equation (40) depend on all the parameters in the problem

$$
\lambda = \lambda(\alpha, \epsilon, \mu, b_j) \tag{44}
$$

and also on the resolution chosen for the eigenfunction and the bottom profile. The actual
values were produced using the standard QZ algorithm on the generated A, B matrices.
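
A minimal sketch of filling the two matrices is given below, under stated assumptions. The entries follow the Fourier-coefficient form A_jk = (α² + jk) b_{j−k} − ε⁻¹(α² + jk + μ) δ_jk, B_jk = (j − k) b_{j−k}. The truncation to modes −N..N, the names `build_matrices` and `inverse_eigenvalues`, and the test profile b(y) = 1 + cos y are all illustrative choices. Instead of a QZ routine, the sketch reduces the pencil to a standard eigenproblem with NumPy, which is adequate here only because A is invertible; the computed numbers w satisfy B v = w A v and therefore play the role of 1/λ up to the sign convention adopted for the algebraic problem.

```python
import numpy as np


def build_matrices(b_hat, n_modes, alpha, eps, mu=0.0):
    """Fill A and B for Fourier indices j, k = -n_modes, ..., n_modes.

    `b_hat` is a dict mapping an integer m to the Fourier coefficient b_m
    of the profile b(y); coefficients not listed are taken to be zero.
    """
    idx = np.arange(-n_modes, n_modes + 1)
    size = idx.size
    A = np.zeros((size, size), dtype=complex)
    B = np.zeros((size, size), dtype=complex)
    for r, j in enumerate(idx):
        for c, k in enumerate(idx):
            b_jk = b_hat.get(int(j - k), 0.0)
            A[r, c] = (alpha**2 + j * k) * b_jk
            B[r, c] = (j - k) * b_jk
            if j == k:
                A[r, c] -= (alpha**2 + j * k + mu) / eps
    return A, B


def inverse_eigenvalues(A, B):
    """Return w with B v = w A v, i.e. the eigenvalues of A^{-1} B."""
    return np.linalg.eigvals(np.linalg.solve(A, B))


# Assumed test profile b(y) = 1 + cos(y): b_0 = 1, b_1 = b_{-1} = 1/2.
b_hat = {0: 1.0, 1: 0.5, -1: 0.5}
A, B = build_matrices(b_hat, n_modes=16, alpha=0.5, eps=0.05)
w = inverse_eigenvalues(A, B)
print("largest |Re w|:", np.max(np.abs(w.real)))
```

For a real profile, A comes out Hermitian and B skew-Hermitian, so the computed w should be purely imaginary up to rounding, which gives a quick sanity check on the matrix assembly.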
5. SIMULATING RANDOM BOUNDARIES

Most of the results presented in this paper are for simulated models of a rough ocean
floor. There is of course an element of choice in the way one simulates rough surfaces.
Various methods are discussed by Ogilvy (1991). Our own choice was motivated by
both practical and theoretical considerations:

• It is natural to refer vertical distances to the maximum depth of the undisturbed
ocean, as was done above in the non-dimensionalization process. Consequently we
want b_k > 0 for all k. This condition is ensured by using an exponential distribution.

• There is strong evidence from experimental bathymetry data that the floor of the
ocean is non-Gaussian (see for example, Dworski and Holloway (1983)). Indeed our
simulated profiles are somewhat reminiscent of experimental measurements.

• From a theoretical point of view, it is desirable to work with a process for which all
of the moments are finite, as is the case for the exponential distribution. Although
this does not play a major role here, several theoretical statistical results for moving
average processes of the type described below only hold under the assumption that the
moments of higher order exist (see for example, Grenander and Rosenblatt (1956)).

The principal tools we have used to produce synthetic bottom profiles are wide-sense,
discrete-“time”, stationary stochastic processes where at any point y_k = kΔy in physical
space the bottom boundary is represented by the moving average

OO 

KVk)  =  h=  ^  k  =  0,1,2,  ...  (45) 

>=-00 

The V_j were chosen to be independent random variables having a common exponential
distribution function so that the probability that V_j is less than v is given by

$$
P(V_j < v) = 1 - e^{-v}. \tag{46}
$$

The infinite sums must be truncated for computations and in this paper we take as weights

fori  =  0,l,...,W-l, 

^  \  0  otherwise 

where  W  is  the  averaging  width.  Thus  a  set  of  points  were  generated  according 
to  the  prescription 


k+W-l 

=  0.1.  •  •  •  1^6-1- 


j=k 


(48) 


The  Fourier  transform  of  these  values  then  gives  the  coefficients  that  are  used  to 
produce  the  A  and  B  matrices  above. 
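
The generation step can be sketched as follows. The sketch assumes uniform weights 1/W over the averaging window, which is consistent with a mean of ε for the scaled profile εb_k but is an assumption about the weight value; the names `synthetic_profile` and `fourier_coefficients` and the use of NumPy's random generator are illustrative, not taken from the paper.

```python
import numpy as np


def synthetic_profile(n_b, width, rng):
    """Return b_k, k = 0..n_b-1, as a moving average of exponential variates.

    Each b_k averages the `width` consecutive variates V_k, ..., V_{k+width-1},
    where the V_j are independent with P(V_j < v) = 1 - exp(-v).
    """
    v = rng.exponential(scale=1.0, size=n_b + width - 1)
    kernel = np.full(width, 1.0 / width)  # assumed uniform weights 1/W
    return np.convolve(v, kernel, mode="valid")


def fourier_coefficients(b):
    """Discrete estimate of the coefficients b_m of exp(i*m*y) on [0, 2*pi)."""
    return np.fft.fft(b) / b.size


rng = np.random.default_rng(0)
b = synthetic_profile(n_b=256, width=4, rng=rng)
b_hat = fourier_coefficients(b)
eps = 0.05
profile = eps * b  # the scaled roughness that enters h(y) = 1 - eps*b(y)
```

Because every V_j is positive, each b_k is positive as well, and the realness of b shows up as conjugate symmetry of the computed coefficients.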
Note that in practice the quantities of interest are εV_j, which are also exponentially
distributed, but with parameter 1/ε so that

$$
P(\epsilon V_j < v) = P\!\left(V_j < \frac{v}{\epsilon}\right) = 1 - e^{-v/\epsilon}. \tag{49}
$$

Then the mean and variance for the corresponding εb_k's are

$$
E\{\epsilon b_k\} = \epsilon, \qquad \mathrm{Var}\{\epsilon b_k\} = \epsilon^2, \tag{50}
$$


while the correlations are given by

$$
\mathrm{Cor}(b_j, b_k) =
\begin{cases}
1 - |j - k|/W & \text{if } |j - k| < W, \\
0 & \text{otherwise.}
\end{cases}
\tag{51}
$$
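
The triangular form of the correlation follows from counting shared terms. The short derivation below assumes each b_k is a uniform-weight sum of W consecutive independent variates V_j of unit variance, with a common weight c whose value does not affect the result; the symbol c is introduced here for illustration only.

```latex
\begin{align*}
b_k &= c \sum_{j=k}^{k+W-1} V_j, \qquad \mathrm{Var}(V_j) = 1, \\
% two windows starting at j and k share max(W - |j-k|, 0) variates
\mathrm{Cov}(b_j, b_k) &= c^2 \,\max\bigl(W - |j-k|,\, 0\bigr), \qquad
\mathrm{Var}(b_k) = c^2 W, \\
\mathrm{Cor}(b_j, b_k) &= \frac{\mathrm{Cov}(b_j, b_k)}{\mathrm{Var}(b_k)}
  = \max\!\left(1 - \frac{|j-k|}{W},\, 0\right).
\end{align*}
```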


Larger values of the parameter ε increase the mean value of the bottom profile and
also the deviations from that mean, while increasing W makes points on the boundary
more correlated and tends to smooth it out. This is observed in figure (6), which shows
profiles εb(y) for some different values of ε and W. In each case N_b = 256. The number
τ reported on each graph is the geometric temperature which was discussed earlier. We
point out that larger values of τ are clearly associated with “wilder” boundaries.


Figure 6 Some realizations of bottom profiles for different values of ε and W with
N_b = 256. The scale on each is identical to that shown on the lowest left graph.
[Panels are labelled by averaging width (W = 32, W = 16) and by ε (ε = 0.05, ε = 0.025), each with its temperature τ.]
6. DISPERSION RELATION RESULTS


In this section we take up the problem of numerically solving the algebraic eigenvalue
problem (40). Note that the input function b(y) is real valued and thus b_j = b*_{-j}, where
the superscript star denotes the complex conjugate. Using this, it is easy to show that
the matrix A is Hermitian while the matrix B is skew-Hermitian. Actually, for the
results presented in this section, we also assumed that the bottom profile is symmetric,
b(y) = b(−y). The matrices are then real, which simplifies the numerical calculation
of the eigenvalues somewhat.

The eigenvalues can be considered as functions of all the parameters in the problem,
λ = λ(α, ε, μ, b_j). The b_j depend on the floor data b_k = b(y_k), which in turn are
determined by the averaging width W described in the previous section. In real long
wave flows the parameter μ is tiny and we have taken it to be zero in all our simulations.
Therefore λ = λ(α, ε, W).

There are numerical resolution parameters to be considered also: how many Fourier
modes, or equivalently how many points in physical space, are used to resolve the
bottom boundary and the disturbance η(y)? The point of view we have taken is that if
N_b modes are used for b(y) then one should increase the number of modes used for η(y)
until convergence is seen. In our study several such resolution studies were performed.
To capture the “longest” mode (the η(y) with the fewest zeros) it was found that using
an expansion for η(y) with N_b modes was always more than adequate. Typical
values of N_b were 128, 256, and 512. Note that extracting the eigenvalues is an O(N_b³)
process, so going to higher resolutions is prohibitively expensive.

For each choice of the parameters ε and W one can generate many realizations
of a bottom topography, each of which has approximately the same thermodynamic
properties. Figure (7) shows how the temperature of the bottom profile changes for
twenty different realizations each for ε = 0.025, 0.050, 0.075 and W = 4. In practice
each trial corresponds to choosing a fresh seed for a random number generator. It is
clear from the plot that increasing ε leads to an increase in τ, although there is also
some variability from realization to realization.


Figure 7 The temperature of twenty different realizations of
bottom profiles for three different values of ε. In each case W = 4.
[Axes: temperature τ against realization number.]

Although for fixed values of ε and W the curve temperature remains fairly close to
some constant value, there can be quite a range for the curve ordinates. This is depicted
in figure (8), which shows the mean, and the upper and lower bounds found for εb(y_k) over
20 realizations, each with ε = 0.05, W = 4. Also clearly visible in this plot is the
symmetry assumption mentioned earlier. That assumption will be removed in a later
paper. Also note the mean profile has εb(y) ≈ ε as we would expect.
For each individual bottom profile we look at a range of wavenumbers α, fill the
matrices A, B and solve the eigenvalue problem. The eigenvalues are sorted according
to their size and the largest ones are output; we already know that the corresponding
eigenfunction will have the fewest zeros and thus correspond to the longest wave. We
then can make a plot of the dispersion relation which shows the wave speed σ as a
function of the x-wavenumber α.

Figure 8 Average and extreme values of the bottom profile for ε = 0.05, W = 4.

Large numbers of eigenvalue problems are tackled in this process. The total number
can be expressed as N_W N_ε N_r N_α, where the four N’s respectively represent the number
of averaging widths tried, the number of ε’s used, the number of realizations generated,
and the number of axial wavenumbers investigated per realization. Many of these runs
are independent, so if a distributed network of workstations is available they can be used
with good effect to reduce the computational burden.
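
The bookkeeping for such a sweep can be sketched as below. The parameter values, the function names `enumerate_runs` and `split_among_workers`, and the round-robin split are illustrative assumptions; the sketch only enumerates and partitions the independent runs and does not solve any eigenvalue problem.

```python
from itertools import product


def enumerate_runs(widths, epsilons, n_realizations, alphas):
    """List every (W, eps, realization index, alpha) combination to be solved."""
    return list(product(widths, epsilons, range(n_realizations), alphas))


def split_among_workers(runs, n_workers):
    """Deal the runs out round-robin so each worker gets a similar share."""
    return [runs[i::n_workers] for i in range(n_workers)]


# Hypothetical sweep: 2 widths, 3 roughness levels, 20 realizations, 10 wavenumbers.
widths = [4, 16]
epsilons = [0.025, 0.050, 0.075]
alphas = [0.1 * k for k in range(1, 11)]
runs = enumerate_runs(widths, epsilons, 20, alphas)
chunks = split_among_workers(runs, n_workers=8)
print(len(runs), [len(chunk) for chunk in chunks])
```

Runs that share the same width, ε and realization index reuse one bottom profile, so in practice the realization seed would be derived from that triple rather than from the full tuple.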

In figure (9) we show the mean value of the dispersion curve (for the longest wave)
found for three different values of ε. In each case W = 4 and runs were done for 20
realizations of the bottom topography. Also reported on the graph is the mean value of
the temperature of the bottom in each of the cases. Clearly as the temperature rises so
also does the mean value of the wave speed σ.

A more detailed statistical study of the variation of σ with τ will be the subject of
another paper. However, it is clear that there is a correlation between the roughness
parameter and the predicted wave speed of long waves.

Of course there is also some variability in the computed dispersion relations for
different realizations at a fixed value of ε. In figure (10) plots are shown of the
minimum and maximum eigenvalues found across all the bottom profiles run. This
is done for ε = 0.025 and 0.050, corresponding to profiles with temperatures close to
τ = 0.24 and τ = 0.34 respectively. The region enclosed by minimum and maximum
plots on the left is clearly different from that enclosed on the right.


Figure 9 The dispersion relation for the longest wave averaged over
20 different realizations with the same ε. W = 4 in each case.
[Horizontal axis: wavenumber α.]

Figure 10 The range of values found for the eigenvalues over all the runs for two different values of ε
(ε = 0.025, left, and ε = 0.050, right). [Horizontal axis: nondimensional wavenumber α.]


7.  SUMMARY 

We have presented a method for quantifying roughness of curves and surfaces that
is easily implemented for experimental data sets. In contrast to fractal analyses, the
technique is based on probabilistic concepts that validly apply to data sets of finite
resolution. Intuition as to the meaning of the temperature of a curve was developed by
means of simple examples, and efficient algorithms and implementations were discussed
for more realistic data.

While  there  is  no  doubt  that  geometric  thermodynamics  is  a  useful  identification 
tool  it  is  an  open  question  as  to  whether  it  can  be  used  in  a  predictive  fashion  in 
oceanography.  To  that  end  we  are  currently  studying  the  testbed  problem  of  topographic 
waves  in  an  ocean  of  random  depth  and  have  presented  some  early  results  in  this  paper 
for  synthesized  bottom  profiles.  It  will  be  interesting  to  make  use  of  real  bathymetric 
data  in  these  simulations  also. 

The coefficients in the governing equations are the derivatives of random functions
and are therefore not necessarily small. One question that we are now investigating
is whether it is possible to replace a complex boundary with a much simpler one having
the same “average” slope, where that quantity is proportional to U = 2|Γ|/|∂K|.
Unfortunately, as was mentioned at the end of section 2, U by itself is insensitive to
some features of a boundary that we would not expect the flow to be insensitive to.
Work on this and other points is ongoing and we will present a more detailed analysis
in a future paper.
ABSTRACT

In this paper, we develop grid-scale dependent parameters, including eddy viscosity
and eddy β, for non-eddy-resolving simulations of β-plane turbulence. These
eddy parameters account for the effect of subgrid scale (SGS) turbulence and Rossby
waves on the resolved scales of motion and are derived in a self-consistent framework
provided by the renormalization group (RG) theory of turbulence. The RG
formalism allows a coarsened description of a strongly non-linear system with a very
large number of degrees of freedom, or wave numbers in Fourier space, by successive
elimination of small shells of wave numbers corresponding to unresolved scales.
The resulting equation of motion for large, resolved scales is structurally similar to
the initial equation but its dimensional parameters, viscosity and β, are rescaled,
or renormalized, and depend on the wave number. In the resulting description of
β-plane turbulence, the flow field at relatively small scales, or large wave numbers
k, has behavior typical of 2-D turbulence; however, as k → 0, the β-effect becomes
significant and the flow characteristics develop strong anisotropy. The energy transfer,
energy spectra and two-parametric viscosity and β are calculated in the energy
sub-range. At relatively large k the energy spectrum is isotropic and follows the
Kolmogorov −5/3 law; with decreasing k the spectrum becomes substantially anisotropic
as the energy is preferentially transported into zonal flows and zonally propagating
Rossby waves that develop a power law with exponent approximately [...] in the zonal
direction. We conclude that the anisotropization of the energy transfer is associated
with the mechanism of generation and maintenance of mean zonal flows and radiation
of zonally propagating Rossby waves by non-linear interactions. It is argued
that the two-parametric viscosity and β should be used as eddy viscosity and eddy
β in non-eddy-resolving simulations of β-plane turbulence. In physical space, the
large-scale dynamics is described by a Kuramoto-Sivashinsky-type equation, which
includes a negative (destabilizing) Laplacian, positive (stabilizing) biharmonic friction
term, and, possibly, higher order hyperviscosities. Being numerically stable,
this equation naturally incorporates negative viscosity phenomena. The results and
their implications in the context of non-eddy-resolving modeling in geophysical fluid
dynamics are discussed.


1.  INTRODUCTION 

The rapid development of computer technology has enabled ever increasing resolution
in ocean circulation models. At the present time, there exist oceanic general
circulation models (OGCM) with grids as small as [...] that are capable of resolving
mesoscale eddies, i.e., processes at the scales of the local deformation radius. Such
eddies play a key role in transport of vorticity, mass, salt, and heat in the world
ocean. The feasibility of eddy-resolving modeling of the global ocean circulation
has been demonstrated by Semtner and Chervin (1988). Recently, they presented
results of extensive eddy-resolving simulations of the world ocean executed on the
largest supercomputer available at that time (Semtner and Chervin, 1992). They
found that even with marginal resolution at high latitudes, the simulated
three-dimensional fields and major features of the global circulation, such as western
boundary currents and the Antarctic Circumpolar Current, were quite realistic. On
the other hand, other key features of the global circulation, such as separation of
the Gulf Stream, are still not faithfully captured by present eddy-resolving models
(Haidvogel et al., 1992). Semtner and Chervin (1992) suggest that further improvement
in the modeling of ocean circulation, particularly in the areas of high
variability, will be achieved with increased resolution; as the next step, simulations
with resolution of [...] are envisioned.

Eddy-resolving simulations, as well as the observational data summarized in
Stammer and Böning (1992), indicate that processes on the scales of the local
deformation radius are crucial for ocean dynamics and should be adequately represented.
Semtner and Chervin (1992), however, propose an even more conservative resolution
criterion that includes a part of the inertial subrange.

Consideration of the inertial subrange opens the possibility of a turbulence-based
subgrid scale (SGS) parameterization for models of large-scale circulation
(see the review by Holloway, 1989). Sensitivity of OGCM predictions to the SGS
parameterization has been well documented (see, for instance, Bryan, 1987). Semtner
and Chervin (1992) note significant changes in their results upon replacing the
classical Laplacian friction by a biharmonic one. They also mention the importance
of effects related to the phenomenon of “negative viscosity.” Among the reasons
why SGS parameterization is so important are, first, that a considerable amount of
energy is concentrated in subgrid scales (see, for instance, Holloway, 1992, who
estimated that the energy flux due to subgrid topographic effects on scales of the order
300 m is comparable in magnitude with those due to other sources), and, second,
that the inverse energy transfer developing in large-scale, quasi-two-dimensional,
turbulent flows facilitates an efficient energy exchange between SGS and resolved
eddies. The inverse energy transfer arising from nonlinear interactions is intimately
related to negative viscosity phenomena (Kraichnan, 1976). On the other hand,
existing models of the large-scale circulation parameterize the SGS effects by highly
dissipative operators that are designed to dissipate the large-scale energy and ensure
numerical stability, but cannot properly account for the complex interaction
between resolved and unresolved modes.

The problem of SGS parameterization is one of the hardest in geophysical
modeling, yet it cannot be resolved by a mere increase in resolution, which quickly
becomes computationally prohibitive even for the fastest supercomputers. This
problem becomes even more acute in climate models that operate on very long time
scales; in these models, any increase in resolution must come at the expense of
other, possibly more important information (see the discussions in Holloway, 1992,
and elsewhere in this volume). Therefore, along with further development of eddy
resolving models, one needs to invest more effort in the better representation of
SGS processes. The latter line of research should not only improve the performance
of eddy resolving models but should also allow one to develop a generation
of non-eddy-resolving models in which SGS parameterization extends up to, and
possibly beyond, the scales of the local deformation radius, thus relaxing resolution
requirements. If such models incorporate a computationally efficient algorithm
for the calculation of SGS parameters, they should become a valuable resource for
modeling large-scale ocean and atmosphere circulations, coupled atmosphere-ocean
systems, and climate. A non-eddy-resolving model of this kind, based upon the
renormalization group theory of turbulence, is described in the present paper.

To account for non-local interactions typical of 2D turbulence, it is convenient
to operate in Fourier space. However, spectral closure methods, already complicated
in the case of purely 2D turbulence, quickly become intractable when spectral
anisotropy and/or waves are added to the picture (for a review of spectral closures,
see Orszag, 1977; Lesieur, 1990; Herring and Kerr, 1993). To circumvent some of
these problems, the ideas of thermal equilibrium statistical mechanics have been
utilized by some authors (Salmon et al., 1976; Holloway, 1992, 1993; Griffa and
Salmon, 1989; Griffa and Castellari, 1991). According to these ideas, a non-linear,
dissipative and forced real system is replaced by a non-dissipative and unforced system
that is allowed to reach thermal equilibrium, i.e., a state of maximum entropy.
In this state, the system characteristics can be calculated using variational principles
(Salmon et al., 1976; Robert and Sommeria, 1991a,b, 1992). Although the state
of thermal equilibrium is never achievable in reality, the tendency toward this state
governs the evolution of real systems (Rose and Sulem, 1978; Kraichnan and Montgomery,
1980). The practical use of the thermal equilibrium approach for purposes
of large-scale oceanographic modeling was outlined and implemented by Holloway
(1992, 1993). He calculated the barotropic thermal equilibrium stream function
that reflects topographic effects based upon the variational principle of maximum
entropy, and then required that the barotropic stream function calculated by a fully
forced OGCM relax towards the thermal equilibrium value. Such an approach improved
the representation of topographic effects in the GFDL OGCM (see articles
by Holloway and Eby and Holloway in this volume).

In this paper, an application of a different statistical mechanical approach to
oceanographic modeling is described. This approach is formulated for a fully non-linear,
forced and dissipative system and is based upon the renormalization group (RG)
theory. The RG theory has been particularly successful in the description of
large-scale,  long-time  behavior  of  systems  associated  with  phase  transitions  and 
critical  phenomena  (see  Wilson  and  Kogut,  1974;  Ma,  1976;  Amit,  1978).  Applied 
to  a  strongly  non-linear  system  with  a  very  large  number  of  degrees  of  freedom,  the 
RG  theory  allows  one  to  “coarsen”  the  description  of  the  system  by  “mapping”  it 
onto  a  system  with  a  significantly  reduced  number  of  modes.  This  reduced  system 
is  described  by  an  equation  structurally  identical  to  the  one  describing  the  initial 
system,  but  its  dimensional  parameters  (such  as  viscosity)  are  renormalized,  or 
multiplicatively  rescaled  in  terms  of  the  scales  being  removed.  The  final  product 
of  this  approach  resembles  the  traditional  eddy  viscosity  parameterization,  but  the 
eddy  parameters  emerging  from  it  are  calculated  by  a  self-consistent  algorithm  based 
upon  the  physics  of  the  problem. 

The  RG  techniques  were  first  applied  to  artificially  forced  fluid  turbulence  (see 
Forster  et  al.,  1977),  and  then  extended  to  realistic  3D  turbulence  (Yakhot  and 
Orszag,  1986).  Since  then,  the  RG  methods  have  been  widely  used  in  a  variety  of 
applications,  including  simulations  of  transitional,  incompressible  and  compressible 
flows, turbulent combustion and derivation of k-ε turbulence transport models.
A review of various applications of the RG methods can be found in Galperin
and Orszag (1993). In analogy to its applications in theoretical physics, the RG
technique for turbulence allows the study of the large-scale, long-time behavior of
turbulent flow fields and their correlation functions. In addition, the RG formalism
can be used for the derivation of SGS models in both 3D and 2D flows. The RG-based
spectral closures are generally simpler than those obtained in other theories,
which is a promising feature for efficient SGS parameterization. The present paper
describes one of the first applications of the RG technique to geophysical flows and
considers the large-scale, long-time behavior of β-plane turbulence, as well as its
SGS parameterization. The flow chosen is one of the simplest “building block”
geophysical flows and has been quite well studied theoretically and numerically.
Although relatively simple, it combines features which have hindered the application of
spectral closures to such flows in the past: spectral anisotropy and Rossby
waves. The methodology of RG not only allows one to advance the analytical
understanding of this flow, but also provides an efficient SGS parameterization that
can be used in large eddy simulations and non-eddy-resolving modeling of β-plane
turbulence.

In the next section, we provide the mathematical formulation of the problem
based upon the barotropic vorticity equation on the β-plane in physical and Fourier
space. In Section 3, we describe the application of the RG procedure and show how
the process of small-scale elimination produces rescaled SGS parameters. In Section
4, an analysis of characteristic time scales is given and the regions dominated by
2D turbulence or Rossby waves are identified. Also, anisotropic energy spectra are
calculated and analyzed. In Section 5, the SGS (or eddy) parameters are introduced
based upon two-point turbulence characteristics. It is shown how the eddy
parameters can be derived using the RG theory. Then, in Section 6, we describe
the RG-calculated anisotropic spectral energy transfer and relate it to existing
information on β-plane turbulence. In Section 7, we show how the RG-based eddy
parameters can be used in practical oceanographic simulations. Finally, in Section
8 we summarize the results obtained here.

2. MATHEMATICAL FORMULATION OF THE β-PLANE PROBLEM
AND RESULTS OF PREVIOUS RESEARCH

The subject of the present paper is the barotropic vorticity equation on the
β-plane:

$$\frac{\partial \zeta}{\partial t} + \frac{\partial(\nabla^{-2}\zeta, \zeta)}{\partial(x, y)} + \beta_0 \frac{\partial}{\partial x}\left(\nabla^{-2}\zeta\right) = \nu_0 \nabla^2 \zeta, \qquad (1)$$

where $\zeta$ is the barotropic vorticity and $\nu_0$ is the molecular viscosity; x and y are
directed eastward and northward, respectively. The constant $\beta_0$ is the background
vorticity gradient; it describes the latitudinal variation of the vertical component
of the Coriolis parameter, f, in the β-plane approximation, $f = f_0 + \beta_0 y$.


This equation describes one of the simplest “building block” systems relevant
to geophysical flows (see, e.g., Pedlosky, 1987). For $\beta_0 = 0$, this equation describes
isotropic 2D turbulence (see the reviews by Kraichnan and Montgomery, 1980; Vallis,
1992). In its linearized form, it describes Rossby waves with the dispersion
relation

$$\omega = -\beta_0 k_x / k^2. \qquad (2)$$

With the full non-linear terms and $\beta_0 \neq 0$, (1) describes the interaction between
2D turbulence and Rossby waves.
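
As an illustration of the dispersion relation (2), the following minimal Python sketch
evaluates the Rossby wave frequency and the zonal phase speed $\omega/k_x = -\beta_0/k^2$,
which is westward for any wave vector when $\beta_0 > 0$. The function names and the
numerical values of $\beta_0$ and of the wavelength are illustrative assumptions and are
not taken from the text.

```python
import math


def rossby_frequency(kx, ky, beta0):
    """Rossby wave frequency omega = -beta0 * kx / (kx^2 + ky^2), as in (2)."""
    k2 = kx * kx + ky * ky
    if k2 == 0.0:
        raise ValueError("the dispersion relation is undefined at k = 0")
    return -beta0 * kx / k2


def zonal_phase_speed(kx, ky, beta0):
    """Zonal phase speed omega / kx = -beta0 / k^2 (requires kx != 0)."""
    return rossby_frequency(kx, ky, beta0) / kx


# Illustrative (assumed) values: a mid-latitude beta0 and a 1000 km wavelength.
beta0 = 2.0e-11                      # 1/(m s)
kx = ky = 2.0 * math.pi / 1.0e6      # 1/m

omega = rossby_frequency(kx, ky, beta0)
c_x = zonal_phase_speed(kx, ky, beta0)
print(f"omega = {omega:.3e} 1/s, zonal phase speed = {c_x:.3e} m/s")
```

The negative phase speed for every $(k_x, k_y)$ reflects the westward propagation
implied by the sign in (2).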

Studies of (1) from the point of view of a non-linear system combining turbulence
and waves were first reported by Rhines (1975). He found that, at large k, the
β-effect is small and the flow behaves largely like 2D turbulence. With decreasing
k, the inverse energy cascade terminates and the flow evolves towards the regime
of linear Rossby waves. The transition from turbulence to the Rossby wave dominated
regime takes place at wave numbers of the order of $k_\beta \sim (\beta/U)^{1/2}$, where
$U$ is a measure of velocity fluctuations in the system. Maltrud and Vallis (1991)
and Vallis and Maltrud (1993) expressed $k_\beta$ in terms of the inverse energy cascade
rate $\bar\epsilon$: $k_\beta = (\beta^3/\bar\epsilon)^{1/5}$. Rhines (1975) noted that as $k_\beta$ is approached, the energy
transfer not only slows down, but also becomes progressively anisotropic, preferring
the zonal direction. Later, Holloway and Hendershott (1977) extended studies of
β-plane turbulence by the use of the test field model (TFM) of Kraichnan (1971), a
spectral closure theory. They reconfirmed the observations of Rhines (1975) about
the slowing down of the inverse energy transfer and its anisotropization with a preferred
zonal direction. They redefined the transitional wave number $k_\beta$ in terms of
the root mean square vorticity, such that Galilean invariance could be satisfied
automatically; in their notation, $k_\beta = \beta/\zeta$, with $\zeta$ here denoting the root mean
square vorticity. Further theoretical accounts can be
found in Carnevale and Martin (1982), Salmon (1982), Maltrud and Vallis (1991),
and Vallis and Maltrud (1993). In particular, Vallis and Maltrud (1993) noted that
$k_\beta$ is an anisotropic parameter, and derived an analytical expression describing its
angular variation. This issue will be revisited in Section 4. Bartello and Holloway
(1991) used the TFM framework for analytical and numerical studies of diffusion
on the β-plane. Holloway (1986) provided a comprehensive review of the β-plane
research.

Turbulence on the β-plane has been extensively studied by numerical experimentation.
In all simulations, robust generation of zonal flows has been observed,
either in Cartesian (Rhines, 1975; Maltrud and Vallis, 1991; Vallis and Maltrud,
1993; Bartello and Holloway, 1991; Panetta, 1993) or spherical (Williams, 1978;
Yoden and Yamada, 1993) coordinate systems. Maltrud and Vallis (1991) observed
that on the β-plane, large-scale vortices tend to radiate their energy as Rossby
waves, a phenomenon that improves the applicability of statistical theories to
β-plane turbulence.

It should be emphasized that in these previous studies, the process of spectral
transfer anisotropization has not been fully quantified or related to SGS
parameterization. Moreover, large eddy simulations (LES) of β-plane turbulence, in which
large-scale modes of the flow are resolved, but small scales are parameterized, have
never been attempted. In the present paper, we quantify the inverse energy transfer,
its anisotropization due to the β-effect, and the interaction between turbulence
and Rossby waves, and develop SGS models for LES of β-plane turbulence. This
effort can be considered as a case study of non-eddy-resolving modeling, in which
SGS eddies are not resolved but their effects are properly incorporated. The results
of this “building block” study can be extended for non-eddy-resolving modeling of
more realistic oceanic and atmospheric systems.

The RG analysis is performed in wave number-frequency space for an infinite
domain. Using the space-time Fourier transform of vorticity, one can derive the
Fourier-space representation of (1):

$$\left(i\omega + i\beta_0 k_x k^{-2} + \nu_0 k^2\right)\zeta(\hat k) = \int \frac{\mathbf{k}\times\mathbf{q}}{q^2}\,\zeta(\hat q)\,\zeta(\hat k - \hat q)\,\frac{d\hat q}{(2\pi)^{d+1}}, \qquad (3)$$

where d is the dimension of space (d = 2 in this study), and $\hat k$ and $\hat q$ are
three-dimensional vectors $(\mathbf{k}, \omega)$ and $(\mathbf{q}, \Omega)$, respectively.

Here we study a forced system with the forcing concentrated at a high wave
number $k_0$. According to the classical results of Batchelor (1969) and Kraichnan
(1967), conservation of energy and enstrophy in the inviscid limit may lead to the
development of two inertial sub-ranges: the energy sub-range for $k < k_0$, where
energy is transported up-scale and the energy spectrum is the Kolmogorov
$E(k) \propto k^{-5/3}$, and the enstrophy sub-range for $k > k_0$, where enstrophy is transported
down-scale and the energy spectrum is $E(k) \propto k^{-3}$. The subject of the present
study is the energy sub-range. Owing to the inverse transfer mechanism, energy is
pumped into ever increasing scales of motion, so that a global steady-state
solution is unreachable. However, if the energy injection rate at $k_0$ is constant in
time and equals $\bar\epsilon$, then after the energy “front” sweeps over a wave number k, a
local steady state will develop at k, in which energy passes through k at
the constant rate $\bar\epsilon$. This local statistically steady state will be analyzed here. It
can be shown (Orszag et al., 1993a) that the effect of the forcing localized at $k_0$ on
the initial equation of motion that resolves all scales is equivalent to the effect of
spatially homogeneous forcing $\xi(\mathbf{k},\omega)$ on the “coarsened” equation that results from
the application of the RG procedure. The forcing $\xi(\mathbf{k},\omega)$ is zero-mean, Gaussian,
white in time, and its correlation function is


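$$\langle \xi(\mathbf{k},\omega)\,\xi(\mathbf{k}',\omega') \rangle = 2D_0 k^{-s}(2\pi)^{d+1}\,\delta(\mathbf{k}+\mathbf{k}')\,\delta(\omega+\omega'), \qquad (4)$$

where s and $D_0$ are parameters to be specified later. If this forcing is inserted into
(1) explicitly, after Fourier transform it results in the following modification of (3):

$$\zeta(\hat k) = G^0(\hat k)\int \frac{\mathbf{k}\times\mathbf{q}}{q^2}\,\zeta(\hat q)\,\zeta(\hat k-\hat q)\,\frac{d\hat q}{(2\pi)^{d+1}} + G^0(\hat k)\,\xi(\hat k), \qquad (5)$$

where $G^0(\hat k) = (i\omega + i\beta_0 k_x k^{-2} + \nu_0 k^2)^{-1}$ is the bare Green function.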

Equation (5) is the subject of the RG analysis given below, in which the object
is to derive an effective equation for large-scale components of $\zeta$. This derivation
will be based upon gradually eliminating small-scale components of the flow field
with characteristic wave numbers of the order of the dissipation cutoff and moving
the dissipation cutoff toward larger scales. In this process, the initially constant
molecular viscosity $\nu_0$ and $\beta_0$ get modified, or renormalized, and become functions
of the dissipation cutoff.

3. SMALL SCALE ELIMINATION BY THE RG PROCEDURE

As formulated by Yakhot and Orszag (1986) for 3D turbulence and adapted
here for 2D turbulence, the RG theory seeks to answer the following question:
“How are the long wave length modes $\zeta^<(\hat k)$ belonging to the interval $0 < k < \Lambda$
affected by the short wave length modes $\zeta^>(\hat k)$ from a narrow wave vector band
$\Lambda - \delta\Lambda < k < \Lambda$?” The answer to this question is obtained by a formal RG procedure.
Repeated many times, it allows one to consider the limit $\Lambda \to 0$, corresponding to
the large-scale asymptotics; if $\Lambda$ remains finite, only a finite shell of short wave
length modes is eliminated, leading to SGS parameterization.
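
The split into long and short wave length modes can be sketched numerically. The
Python example below is only an illustration on a doubly periodic grid with integer
wave numbers, not the infinite-domain formalism used in the analysis; the function
name `split_modes`, the grid size, and the values of the cutoff and shell width are
assumptions made for the example.

```python
import numpy as np


def split_modes(zeta, cutoff, shell):
    """Split a periodic 2D field into slow and fast parts by wave number.

    slow: modes with |k| <= cutoff - shell
    fast: modes with cutoff - shell < |k| <= cutoff
    Modes beyond the cutoff are discarded (assumed absent from the field).
    """
    n = zeta.shape[0]
    k1d = np.fft.fftfreq(n, d=1.0 / n)            # integer wave numbers
    kx, ky = np.meshgrid(k1d, k1d, indexing="ij")
    kmag = np.hypot(kx, ky)

    zeta_hat = np.fft.fft2(zeta)
    slow_mask = kmag <= cutoff - shell
    fast_mask = (kmag > cutoff - shell) & (kmag <= cutoff)

    slow = np.fft.ifft2(np.where(slow_mask, zeta_hat, 0.0)).real
    fast = np.fft.ifft2(np.where(fast_mask, zeta_hat, 0.0)).real
    return slow, fast


rng = np.random.default_rng(0)
n, cutoff, shell = 64, 20.0, 2.0

# Build a sample field that contains no modes beyond the cutoff
# (a zero-width shell makes the "slow" part the whole band-limited field).
field, _ = split_modes(rng.standard_normal((n, n)), cutoff, 0.0)

slow, fast = split_modes(field, cutoff, shell)
print("max |field - (slow + fast)| =", np.abs(field - (slow + fast)).max())
print("mean(slow * fast) =", np.mean(slow * fast))
```

Because the two masks are disjoint, the slow and fast parts are orthogonal and add up
to the original band-limited field; one RG step removes the fast part and folds its
averaged effect into the parameters governing the slow part.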

Let us assume that initially (5) is defined on the interval $0 < k < \Lambda_0$, where $\Lambda_0$
is the dissipation cutoff. Following the Yakhot and Orszag (1986) RG procedure for
3D turbulence, the formal RG procedure for 2D turbulence consists of two steps.
First, one introduces a narrow band of wave vectors near the dissipation cutoff,
$\Lambda_0 - \delta\Lambda_0 < k < \Lambda_0$, and the vorticity field and stirring force are decomposed into
two parts: “fast” modes $\zeta^>(\hat k)$, $\xi^>(\hat k)$ with wave vectors satisfying
$\Lambda_0 - \delta\Lambda_0 < k < \Lambda_0$, and “slow” modes $\zeta^<(\hat k)$, $\xi^<(\hat k)$ for which
$0 < k < \Lambda_0 - \delta\Lambda_0$. With this decomposition, (5) becomes:

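$$\zeta(\hat k) = \lambda_0 G^0(\hat k)\int \frac{\mathbf{k}\times\mathbf{q}}{q^2}\left[\zeta^<(\hat q)\,\zeta^<(\hat k-\hat q) + 2\zeta^>(\hat q)\,\zeta^<(\hat k-\hat q) + \zeta^>(\hat q)\,\zeta^>(\hat k-\hat q)\right]\frac{d\hat q}{(2\pi)^{d+1}} + G^0(\hat k)\,\xi(\hat k), \qquad (6)$$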

where $\lambda_0$ is the formal expansion parameter introduced for the purpose of developing
a perturbative solution to (6); eventually, $\lambda_0$ is set to 1. The perturbative
solution of the non-dimensionalized (6) involves expansion in a non-dimensional
coupling parameter $\bar\lambda_0 = \lambda_0 D_0^{1/2}/(\nu_0^{3/2}\Lambda_0^{\epsilon/2})$, where $\epsilon = 6 + s - d$. Note that $\bar\lambda_0$ is
in fact a “local” Reynolds number determined by the “local” viscosity $\nu(\Lambda_0)$ at the
“local” wave number $\Lambda_0$. After repetitive elimination of small wave number bands
$\delta\Lambda_0$ described below, the local Reynolds number remains O(1) because the “local”
viscosity increases and Kolmogorov-like viscous scaling holds. Thus, although the RG
procedure is based upon expansion in terms of $\bar\lambda = O(1)$, it is likely to
yield much better results than other methods which employ expansions in terms of
the conventional bare Reynolds number, which is very large in real flows.
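
The claim that the local Reynolds number stays O(1) can be illustrated with a short
Python sketch. It is not part of the original analysis: it assumes the coupling
parameter has the form $\lambda_0 D_0^{1/2}/(\nu^{3/2}\Lambda^{\epsilon/2})$ and that the local viscosity follows a
Kolmogorov-like scaling $\nu(\Lambda) = a D_0^{1/3}\Lambda^{-\epsilon/3}$. The amplitude a, the bare viscosity,
and the list of cutoffs are purely illustrative choices.

```python
def local_coupling(d0, nu, cutoff, eps, lam0=1.0):
    """Coupling parameter lam0 * D0^(1/2) / (nu^(3/2) * cutoff^(eps/2))."""
    return lam0 * d0 ** 0.5 / (nu ** 1.5 * cutoff ** (eps / 2.0))


def scaled_viscosity(d0, cutoff, eps, amplitude=1.0):
    """Assumed Kolmogorov-like local viscosity a * D0^(1/3) * cutoff^(-eps/3)."""
    return amplitude * d0 ** (1.0 / 3.0) * cutoff ** (-eps / 3.0)


D0, EPS = 64.0, 4.0      # illustrative: energy sub-range with unit injection rate
NU_BARE = 1.0e-6         # hypothetical bare (molecular) viscosity

cutoffs = [1000.0, 100.0, 10.0, 1.0]
bare = [local_coupling(D0, NU_BARE, c, EPS) for c in cutoffs]
local = [local_coupling(D0, scaled_viscosity(D0, c, EPS), c, EPS) for c in cutoffs]

for c, b, l in zip(cutoffs, bare, local):
    print(f"cutoff = {c:7.1f}   fixed-viscosity coupling = {b:9.3e}   local coupling = {l:.3f}")
```

With the viscosity held at its bare value the coupling grows without bound as the
cutoff moves to larger scales, whereas with the scaled viscosity it is independent of
the cutoff, which is the property that makes an expansion in the local parameter
plausible.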

The “fast” modes $\zeta^>(\hat k)$ are eliminated from (6) through recursive substitution
of the formal solution (5) written for $\zeta^>(\hat k)$. This yields a solution for $\zeta^<(\hat k)$ in
terms of an infinite series in powers of $\lambda_0$.

The second step of the RG procedure consists of taking averages over the short
wave length modes of the stirring force $\xi^>$. The derivations are given by Yakhot and
Orszag (1986), Forster et al. (1977), and in further detail by Smith and Reynolds
(1992). The result is a set of effective dynamical equations for the slow modes with
$0 < k < \Lambda_0 - \delta\Lambda_0$. This process is then iterated to remove further infinitesimal
bands of modes, resulting in a set of ordinary differential equations for $\nu$, $\beta$ and
$\bar\lambda$ as functions of k. Particularly important in the RG theory are those solutions
for which $d\bar\lambda(k)/dk = 0$; a solution of this kind is called a fixed point (Amit, 1978;
Creswick et al., 1992). In the case of 3D isotropic turbulence, Yakhot and Orszag
(1986) showed that at the fixed point $\bar\lambda \propto \epsilon^{1/2}$. For the case of randomly stirred
flows near thermal equilibrium including 2D flows, similar analysis was performed
by Forster et al. (1977). If $\epsilon \to 0$, then also $\bar\lambda \to 0$, and the results are exact.

For finite ε, the so-called ε-expansion (Wilson and Kogut, 1974; Creswick et
al., 1992) is applied in which quantities are evaluated asymptotically only to the
lowest nontrivial order in powers of ε. If ε were equal to zero or at least small
in real turbulence, the problem of turbulence would then be solvable using expansions
in powers of $\bar\lambda$ or ε. Unfortunately, “real” values of ε are not small; indeed,
ε = 4 for 3D turbulence. Does this fact invalidate the ε-expansion? Fortunately,
the situation is not so gloomy, although the mathematical justification for the
ε-expansion procedure for fluid turbulence does not yet exist. Using the ε-expansion
for 3D turbulence, Yakhot and Orszag (1986) succeeded in calculating many basic
constants of turbulence, such as the Kolmogorov and Batchelor constants, from first
principles of the RG theory. Also, RG-derived turbulence transport models have
achieved considerable success in calculations of complex flows in complex geometries
(Orszag et al., 1993b). Applying the ε-expansion to isotropic 2D turbulence,
Staroselsky and Sukoriansky (1993) calculated the Kolmogorov constant, which was
in good agreement with available data. They also calculated a two-parametric viscosity,
which will be discussed later, and found it to be in good agreement with
that calculated by Kraichnan (1976) using an entirely different approach, the TFM
closure theory. Based upon the previous success in application of the ε-expansion
technique to both 3D and 2D turbulence, in the present paper we also apply the
ε-expansion technique to achieve nontrivial results for anisotropic turbulence on the
β-plane; in this procedure, only terms up to second order in $\bar\lambda$ are kept.

At the first iteration of the scale elimination procedure, the equation for $\zeta^<(\hat k)$
becomes


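$$[G^0(\hat k)]^{-1}\zeta^<(\hat k) = \lambda_0\int \frac{\mathbf{k}\times\mathbf{q}}{q^2}\,\zeta^<(\hat q)\,\zeta^<(\hat k-\hat q)\,\frac{d\hat q}{(2\pi)^{d+1}} + \cdots + \lambda_0\int^> \frac{\mathbf{k}\times\mathbf{q}}{q^2}\,G^0(\hat q)\,G^0(\hat k-\hat q)\,\xi^>(\hat q)\,\xi^>(\hat k-\hat q)\,\frac{d\hat q}{(2\pi)^{d+1}} + O(\lambda_0^3), \qquad (7)$$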


where $\int^>$ denotes integration over the band being removed. Equation (7) will be
analyzed in a small-k approximation, i.e., in the asymptotic limit when $k/\Lambda_0 \to 0$.
The analysis of neglected terms is quite similar to the case of the 3D Navier-Stokes
equations considered by Yakhot and Orszag (1986) and Forster et al. (1977).

The first term on the right of (7) is the usual non-linear term for the slow
modes while the second term is generated by non-linear fast mode interactions.
In the limit $k/\Lambda_0 \to 0$, after performing a frequency integration over $\Omega$ and setting
$\omega \to 0$, this term can be represented as $\delta[G^0(\mathbf{k})]^{-1}\zeta^<(\mathbf{k})$, where


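$$\delta[G^0(\mathbf{k})]^{-1} = k^{-2}\int^> \frac{\cdots}{2h(\mathbf{q})\,[h(\mathbf{q}) + h(\mathbf{k}-\mathbf{q})]}\left[k^2q^2 - (\mathbf{k}\cdot\mathbf{q})^2\right]\frac{d^2q}{(2\pi)^2}, \qquad (8)$$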


with $h(\mathbf{q}) = \nu_0 q^2 + i\beta_0 q_x q^{-2}$. Thus $\delta[G^0(\mathbf{k})]^{-1}$ is the correction to the inverse Green
(or response) function, such that its $O(k^2)$ and $O(k^{-1})$ terms describe corrections
$\delta\nu_0$ and $\delta\beta_0$ to the bare viscosity $\nu_0$ and bare $\beta_0$, respectively. It is clear, however,
that since the integral (8) is calculated as a power series expansion in powers of
$k/\Lambda_0$ in the limit $k/\Lambda_0 \to 0$, terms of the form $O(k^{-1})$ do not appear in the result,
i.e., the β-term does not renormalize, $\delta\beta_0 = 0$, and $\beta_0$ remains constant and equal
to its bare value in the process of small-scale elimination. Keeping this in mind,
the subscript 0 will be removed from $\beta_0$ in the following.

The last term in (7) is a correction to the stirring force. In the formal limit
$\omega \to 0$ at the fixed point corresponding to $\epsilon > 0$ this term develops a singularity of
the kind $1/(-2\epsilon - d + 2)$, which is specific to 2D turbulence (d = 2). The physics
of this singularity is clear: the inverse cascade cannot possibly exist as a truly
stationary state of a closed system with no energy sink at the largest scales. This
singularity can be removed by retaining finite but small $\omega$ in calculation of the
force renormalization integral. This is not a rigorous way of singularity removal
but rather a use of intermediate asymptotics corresponding to finite (non-infinite)
times or finite (non-zero) frequencies. Upon removing the singularity, it can be
shown that this term introduces an $O(\bar\lambda^2)$ correction to the correlation function (4)
which in turn generates $O(\bar\lambda^4)$ terms at the next iteration steps, such that this term
can be neglected altogether.

After the integration over the shell $\delta\Lambda_0$ is completed, the renormalized form
of equation (5) is obtained. Structurally, it is the same as the original (5), but
its parameters are renormalized, and it is defined in the reduced interval of wave
numbers $0 < k < \Lambda_0 - \delta\Lambda_0$. Repeating this procedure many times, one can remove a
finite band of modes. The largest remaining wave number will be called the moving
dissipation cutoff and denoted by $k_c$.

At the fixed point, a fully renormalized equation of motion is obtained; it reads

$$\zeta(\hat k) = G_r(\hat k)\,\xi(\hat k), \qquad (9)$$

where $G_r(\hat k) = [i\omega + i\beta k_x k^{-2} + \nu(\mathbf{k})k^2]^{-1}$ is the renormalized Green function. This
equation will be used in the next section for calculation of characteristic time scales,
correlation functions and spectra.

Let us now apply the RG scheme of small-scale elimination to the case of
isotropic 2D turbulence ($\beta = 0$) (Staroselsky and Sukoriansky, 1993). It can be
shown that for the energy sub-range, s = 0 and ε = 4. Equating the total energy
transport through the system to the constant rate of energy injection, $\bar\epsilon$, Staroselsky
and Sukoriansky (1993) established that $D_0 = 64\bar\epsilon$. These parameters lead to the
classical Kolmogorov energy spectrum $E(k) \propto k^{-5/3}$. In the enstrophy sub-range,
s = 2, ε = 6, $D_0 \propto \eta$, where $\eta$ is the constant rate of enstrophy dissipation,
and the energy spectrum is $E(k) \propto k^{-3}$. In the limit $k \to 0$, $\omega \to 0$ and in
the lowest order of the ε-expansion the recursive differential relation for $\nu(k)$ is


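$$\frac{d\nu(k)}{dk} = -\frac{1}{16\pi}\,\frac{D_0}{\nu^2(k)\,k^{\epsilon+1}}. \qquad (10)$$

The solution to (10) for k well below the initial cutoff is

$$\nu(k) = \left(\frac{3}{16\pi}\,\frac{D_0}{\epsilon}\right)^{1/3} k^{-\epsilon/3}, \qquad (11)$$

which indicates that in isotropic 2D turbulence $\nu(k)$ grows like $k^{-4/3}$.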
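
The $k^{-4/3}$ growth can be checked by carrying out the shell elimination numerically.
The Python sketch below is a hedged illustration only: it assumes that the recursion
relation (10) has the form $d\nu/dk = -D_0/(16\pi\,\nu^2 k^{\epsilon+1})$ with ε = 4, which is an
inference from the dimensional structure of the problem, and it uses arbitrary
illustrative values for $D_0$, the starting cutoff, and the starting viscosity.

```python
import math


def viscosity_by_shell_elimination(nu_start, d0, k_start, k_end, eps=4.0, steps=20000):
    """Integrate d(nu)/dk = -D0 / (16 pi nu^2 k^(eps+1)) from k_start down to k_end.

    Shells are geometrically spaced; each shell is removed with a midpoint (RK2) step.
    """
    ratio = (k_end / k_start) ** (1.0 / steps)
    k, nu = k_start, nu_start
    for _ in range(steps):
        dk = k * (ratio - 1.0)                      # negative: cutoff moves to larger scales
        slope = -d0 / (16.0 * math.pi * nu ** 2 * k ** (eps + 1.0))
        nu_mid = nu + 0.5 * dk * slope
        k_mid = k + 0.5 * dk
        slope_mid = -d0 / (16.0 * math.pi * nu_mid ** 2 * k_mid ** (eps + 1.0))
        nu += dk * slope_mid
        k *= ratio
    return nu


def viscosity_closed_form(nu_start, d0, k_start, k, eps=4.0):
    """Exact solution of the same ODE: nu^3 = nu_start^3 + 3 D0 (k^-eps - k_start^-eps) / (16 pi eps)."""
    cube = nu_start ** 3 + 3.0 * d0 / (16.0 * math.pi * eps) * (k ** (-eps) - k_start ** (-eps))
    return cube ** (1.0 / 3.0)


D0 = 64.0            # illustrative: D0 = 64 * (unit energy injection rate)
K_START = 100.0      # illustrative initial dissipation cutoff
NU_START = 1.0e-3    # hypothetical viscosity at the initial cutoff

nu_at_1 = viscosity_by_shell_elimination(NU_START, D0, K_START, 1.0)
nu_at_2 = viscosity_by_shell_elimination(NU_START, D0, K_START, 2.0)
log_slope = math.log(nu_at_1 / nu_at_2) / math.log(1.0 / 2.0)

print("numerical nu(k=1):  ", nu_at_1)
print("closed-form nu(k=1):", viscosity_closed_form(NU_START, D0, K_START, 1.0))
print("d ln(nu) / d ln(k): ", log_slope)
```

Far below the starting cutoff the memory of the starting viscosity is lost and the
logarithmic slope approaches -4/3, in line with the scaling stated above.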

Let us now consider the general case $\beta \neq 0$. Then, the integrand in (8) contains
angle-dependent functions $\nu(\mathbf{q})$ and $h(\mathbf{q}) = \nu(\mathbf{q})q^2 + i\beta q_x q^{-2}$. In principle,
the renormalized force could also become anisotropic, leading to $D = D(\mathbf{q})$. This
would add further difficulties to the already difficult problem of force renormalization
discussed earlier. The possible anisotropization of the renormalized forcing will
be neglected in the present study. On the one hand, such an assumption seems to
be justified by the fact that the renormalized Green function $G_r(\hat k)$ absorbs considerable
anisotropy; on the other, some sensitivity studies discussed later indicate
that the results are not very sensitive to the forcing anisotropy.

The anisotropy due to $\beta \neq 0$ makes the angular integration in (8) nontrivial,
so that a self-contained equation similar to (10) cannot be obtained for anisotropic
$\nu(\mathbf{k})$. Instead, recast in terms of the non-dimensional parameters $M = \nu k^3/\beta$ and
$z = D_0 k^{9-\epsilon}/\beta^3$, $\nu(\mathbf{k})$ is described by the integro-differential equation


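$$\frac{dM(z,\phi)}{dz} = \frac{3}{9-\epsilon}\,\frac{M(z,\phi)}{z} - \frac{1}{2}\,\frac{1}{9-\epsilon}\,\frac{1}{2\pi}\int_0^{2\pi}\frac{F(M,\theta,\phi)}{M^2(z,\theta)}\,d\theta, \qquad (12)$$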
Figure 1. Eddy damping parameter $\nu(\mathbf{k})$ normalized by its maximum value.


where $\phi = \arctan(k_y/k_x)$ and


F{M,e,4>) 


1  —  cos^  $ 

2cos(5  -I-  (;^)cos^]  -  6cos^^)}  . 


(13) 


The analysis of (12), (13) reveals that for $z \gg 1$, turbulence is essentially
isotropic, and $\nu(\mathbf{k}) \propto k^{-4/3}$ as in pure 2D turbulence. The anisotropy induced
by the β-effect develops at z = O(1). The results below pertain to the case of
the energy sub-range, where s = 0, ε = 4, $z = D_0 k^5/\beta^3$, $D_0 = 64\bar\epsilon$, giving
$z = 64\bar\epsilon k^5/\beta^3 = 64(k/k_\beta)^5$, where $k_\beta = (\beta^3/\bar\epsilon)^{1/5}$. As will be discussed in the next
section, $k_\beta$ is a transitional wave number that separates regions of 2D turbulence
and Rossby wave domination.

Solving (12), (13) gives $\nu(\mathbf{k})$ as depicted in Fig. 1 in cylindrical surface coordinates
as a function of $\mathbf{k}/k_c$; $k_c$ here and in Figs. 2, 4 below is just a dimensional scale
that corresponds to $z = 10^3$, or $k_c = (10^3/64)^{1/5}k_\beta = 1.73k_\beta$. At small k, $\nu(\mathbf{k})$
grows sharply for $\phi \in [\pi/4, 3\pi/4]$ and $[5\pi/4, 7\pi/4]$; it abruptly decreases to zero
along $\phi = 0, \pi$, causing a singularity in the numerical solution of (12) at $z \approx 0.04$,
or $k \approx 0.23k_\beta$.
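
The two wave numbers quoted above follow directly from the relation $z = 64(k/k_\beta)^5$
for the energy sub-range. The short Python check below reproduces them; the function
names and the sample values of β and of the energy injection rate are illustrative
assumptions, not values from the text.

```python
def transitional_wavenumber(beta, energy_rate):
    """k_beta = (beta^3 / energy_rate)^(1/5)."""
    return (beta ** 3 / energy_rate) ** 0.2


def k_over_k_beta(z, prefactor=64.0, power=5.0):
    """Invert z = prefactor * (k / k_beta)^power for k / k_beta."""
    return (z / prefactor) ** (1.0 / power)


scale_ratio = k_over_k_beta(1.0e3)       # dimensional scale k_c in units of k_beta
singular_ratio = k_over_k_beta(0.04)     # location of the numerical singularity

print(f"k_c / k_beta        = {scale_ratio:.2f}")
print(f"k_singular / k_beta = {singular_ratio:.2f}")

# Hypothetical dimensional example (values chosen only for illustration).
beta, energy_rate = 2.0e-11, 1.0e-9      # 1/(m s), m^2/s^3
print(f"k_beta = {transitional_wavenumber(beta, energy_rate):.3e} 1/m")
```
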

4. CHARACTERISTIC TIME SCALES AND SPECTRA
IN β-PLANE TURBULENCE

The relative magnitudes of the turbulence time scale, the eddy turnover time
$\tau_{tu} = [\nu(\mathbf{k})k^2]^{-1}$, and the Rossby wave period, $\tau_R = (\beta\cos\phi/k)^{-1}$, determine which
process dominates. In Fig. 2a we plot $\tau_{tu}/\tau_R = \cos\phi/M(k,\phi)$. At large k, this
ratio is smaller than 1 and the flow is turbulence dominated. With decreasing k,
the ratio becomes progressively anisotropic; it remains much smaller than 1 for the
directions close to $\phi = \pm\pi/2$ but rapidly grows in the vicinity of $\phi = 0$ and $\pi$,
indicating progressive domination of Rossby waves. In Fig. 2b, we plot only the
Rossby wave dominated region of Fig. 2a, i.e., only the part where $\tau_{tu}/\tau_R > 1$.
The base of this surface, i.e., the curve at which $\tau_{tu}/\tau_R = 1$, can be identified as
the threshold between turbulence and Rossby wave domination. Vallis and Maltrud
(1993) derived an analytical expression for such a curve using scaling relations based
upon isotropic 2D turbulence. For $k_\beta = (\beta^3/\bar\epsilon)^{1/5}$, they found

$$k_{x\beta} = k_\beta\cos^{8/5}\phi, \qquad (14a)$$

$$k_{y\beta} = k_\beta\sin\phi\,\cos^{3/5}\phi, \qquad (14b)$$

such that the anisotropic transitional wave number, $k_t(\phi)$, becomes

$$k_t(\phi) = (k_{x\beta}^2 + k_{y\beta}^2)^{1/2} = k_\beta\cos^{3/5}\phi. \qquad (15)$$

The curve given by (15), described by Vallis and Maltrud (1993) as a “dumb-bell
shape,” is shown in Fig. 3; it seems to agree well with the contour $\tau_{tu}/\tau_R = 1$ in
Fig. 2b.
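
The dumb-bell curve can be traced with a few lines of Python. The sketch below
evaluates the components $k_\beta\cos^{8/5}\phi$ and $k_\beta\sin\phi\cos^{3/5}\phi$ together with the magnitude
$k_\beta\cos^{3/5}\phi$ and confirms that they are mutually consistent. Using $|\cos\phi|$ in the
fractional powers, so that the curve is defined over the full circle and symmetric
about the $k_y$ axis, is an assumption of this example, as are the function name and
the choice $k_\beta = 1$.

```python
import numpy as np


def dumbbell(phi, k_beta=1.0):
    """Transitional wave vector (kx, ky) and its magnitude kt as functions of angle phi."""
    c = np.cos(phi)
    a = np.abs(c)                                  # assumed: |cos| inside fractional powers
    kx = k_beta * np.sign(c) * a ** 1.6            # k_beta * cos^(8/5) phi
    ky = k_beta * np.sin(phi) * a ** 0.6           # k_beta * sin(phi) * cos^(3/5) phi
    kt = k_beta * a ** 0.6                         # k_beta * cos^(3/5) phi
    return kx, ky, kt


phi = np.linspace(0.0, 2.0 * np.pi, 361)
kx, ky, kt = dumbbell(phi)

print("max |hypot(kx, ky) - kt| =", np.abs(np.hypot(kx, ky) - kt).max())
print("kt at phi = 0:   ", kt[0])
print("kt at phi = pi/2:", kt[90])
```

The transitional wave number is largest along the zonal wave vector direction
($\phi = 0, \pi$) and shrinks to zero at $\phi = \pm\pi/2$, which produces the two lobes of the
dumb-bell.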

It is important to note that the singularity in $\nu(\mathbf{k})$ at $z = 0.04$, or $k = 0.23k_\beta$
for $\phi = 0, \pi$, mentioned in the previous section, resides well inside the dumb-bell
shape, such that it should not significantly affect turbulence and wave-turbulence
transitional processes that take place at much larger k. It is thus expected that 2D
turbulence-Rossby wave interactions are captured faithfully by the RG model.

The preceding results indicate that as $k \to 0$, the β-term significantly affects
the nature of the flow field, making it anisotropic and either turbulence- or
wave-dominated. One should expect that the energy spectrum would also develop
dependence upon $\phi$ and that an anisotropic spectrum $E(\mathbf{k})$ must be considered.

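The energy spectrum, $E(\mathbf{k})$, is related to the vorticity correlation function
$U(\mathbf{k},\omega)$, which in turn is expressed in terms of the correlation function of the stirring
force (4) using (9):

$$U(\mathbf{k},\omega) = \frac{\langle\zeta(\mathbf{k},\omega)\,\zeta(\mathbf{k}',\omega')\rangle}{(2\pi)^{d+1}\,\delta(\mathbf{k}+\mathbf{k}')\,\delta(\omega+\omega')} = 2D_0 k^{-s}\,G_r(\mathbf{k},\omega)\,G_r(-\mathbf{k},-\omega). \qquad (16)$$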
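
Integrating (16) over frequency gives an equal-time vorticity correlation. This step
is not written out in the text; the Python sketch below works it out under stated
assumptions: the renormalized Green function is taken as
$[i\omega + i\beta k_x k^{-2} + \nu k^2]^{-1}$, and $\nu$ is treated as a given number at the chosen wave vector,
with the same value at $-\mathbf{k}$. Under these assumptions the integrand is a Lorentzian in $\omega$
and the integral over $d\omega/2\pi$ equals $D_0 k^{-s}/(\nu k^2)$, independent of β. All numerical values
are illustrative.

```python
import numpy as np


def green_r(omega, kx, ky, beta, nu):
    """Assumed renormalized Green function [i w + i beta kx / k^2 + nu k^2]^-1."""
    k2 = kx * kx + ky * ky
    return 1.0 / (1j * omega + 1j * beta * kx / k2 + nu * k2)


def equal_time_correlation(kx, ky, beta, nu, d0, s=0.0, omega_max=2.0e3, n=400_001):
    """Trapezoidal integral of U(k, w) = 2 D0 k^-s G_r(k, w) G_r(-k, -w) over dw / (2 pi)."""
    k = np.hypot(kx, ky)
    omega = np.linspace(-omega_max, omega_max, n)
    u = (2.0 * d0 * k ** (-s)
         * green_r(omega, kx, ky, beta, nu)
         * green_r(-omega, -kx, -ky, beta, nu)).real
    d_omega = omega[1] - omega[0]
    return (np.sum(u) - 0.5 * (u[0] + u[-1])) * d_omega / (2.0 * np.pi)


# Illustrative parameters (assumptions): nu * k^2 = 1 at this wave vector.
kx, ky, nu, d0 = 1.0, 1.0, 0.5, 64.0
expected = d0 / (nu * (kx * kx + ky * ky))

with_beta = equal_time_correlation(kx, ky, beta=3.0, nu=nu, d0=d0)
without_beta = equal_time_correlation(kx, ky, beta=0.0, nu=nu, d0=d0)

print("numerical, beta = 3:", with_beta)
print("numerical, beta = 0:", without_beta)
print("D0 k^-s / (nu k^2): ", expected)
```

In this sketch β only shifts the Lorentzian along the frequency axis, so any
anisotropy of the equal-time correlation enters through the angular dependence of
$\nu(\mathbf{k})$ rather than through the wave term directly.
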

Figure 2a. The ratio of the eddy turnover to Rossby wave time scales, $\tau_{tu}/\tau_R$.

Figure 2b. The region of Rossby wave domination, $\tau_{tu}/\tau_R > 1$.

Figure 3. The “dumb-bell shape” of Vallis and Maltrud (1993) for the anisotropic
transitional wavenumber $k_t(\phi)$ given by (15).


In Figs. 4a,b, we plot the compensated energy spectra $E(\mathbf{k})k^{5/3}$ and $E(\mathbf{k})k^{7/3}$,
respectively. One can see that for large k, $E(\mathbf{k})$ is isotropic and proportional to
$k^{-5/3}$, which is the classical Kolmogorov spectrum found in the energy sub-range
of isotropic 2D turbulence. As $k \to 0$, the spectral anisotropy develops; the results
plotted in Fig. 4b indicate that $E(\mathbf{k}) \propto k^{-7/3}$ is a good approximation for
$\phi = 0, \pi$. One could speculate that this spectrum is generated by interacting non-linear
Rossby waves, which is indeed the mechanism considered, for instance, by Monin
and Piterbarg (1987) and Reznik (1986). One should note, however, that the $k^{-7/3}$
spectrum is rather qualitative since it occupies a relatively small range of k and
therefore should be considered cautiously. On the other hand, the anisotropic energy
spectrum and particularly the power law $k^{-7/3}$ along $\phi = 0, \pi$ are qualitatively new
predictions of the RG theory which are yet to be compared with data.


Figure 4a. Compensated energy spectrum $E(\mathbf{k})k^{5/3}$.

Figure 4b. Compensated energy spectrum $E(\mathbf{k})k^{7/3}$.

5. TWO-PARAMETRIC TURBULENCE CHARACTERISTICS;
EDDY VISCOSITY AND EDDY β

Although $\nu(\mathbf{k})$ is derived from the bare, or molecular, viscosity $\nu_0$, it is not what
is usually understood as an “eddy” viscosity. According to (9), $\nu(\mathbf{k})$ and $\beta$ are
parts of the renormalized Green function $G_r(\hat k)$ that describes the response of the
mode $\zeta(\hat k)$ to stochastic forcing at the same wave number, $\xi(\hat k)$. Mathematically,
$G_r(\hat k)$ is a response function which can be formally calculated from (9) by taking
a functional derivative of $\zeta(\hat k)$ with respect to the forcing $\xi(\hat k)$, and $\beta$ and $\nu(\mathbf{k})$
are thus response parameters. They allow one to calculate the vorticity correlation
function, $U(\mathbf{k},\omega)$ in (16), and the energy spectrum (17), but they do not directly
relate to enstrophy and energy transfers and dissipation. Furthermore, $[\nu(\mathbf{k})k^2]^{-1}$
can be viewed as a characteristic time scale of information loss at given k caused
by non-linear scrambling of all other modes (see Dannevik et al., 1987). Therefore,
$\nu(\mathbf{k})$ may be interpreted as an eddy damping parameter and is substantially a
one-point turbulence characteristic. In this section, proper “eddy” parameters will be
introduced and it will be shown how they can be calculated using RG-based response
parameters.

To analyze energy and enstrophy transfer, one needs to consider two-point
characteristics that account for interactions between a given large-scale mode $k < k_c$
and all SGS modes $k > k_c$, where $k_c$, as before, is the moving dissipation cutoff. A
fundamental characteristic of this kind, the two-parametric viscosity $\nu(k|k_c)$, was
introduced by Kraichnan (1976) for isotropic 2D and 3D turbulence based upon
one-time, two-point correlation functions; it characterizes the energy transfer from
all SGS modes to a resolved mode with the wave number k. This approach is
sufficient if a system does not support waves, i.e., if the system does not have
dispersion, or its Green function is real at $\omega = 0$. If a system is anisotropic and, in
addition, waves are present, as, for instance, the Rossby wave term in (3), then, as
suggested by Kaneda and Holloway (1992; also this volume), a two-point, two-time
vorticity correlation function, $U(\mathbf{k},t,t') = \langle\zeta(\mathbf{k},t)\,\zeta(-\mathbf{k},t')\rangle$, should be considered.
Here $U(\mathbf{k},t,t')$ is described by a von Karman-Howarth-type equation

$$\left(\partial_t + i\beta k_x/k^2 + \nu k^2\right)U(\mathbf{k},t,t') = T(\mathbf{k},t,t'), \qquad (18)$$

where $T(\mathbf{k},t,t')$ is the two-time, anisotropic, spectral enstrophy transfer function.
Assuming quasi-stationarity, the dependence on t is negligible compared to that on
$t - t'$ and will be ignored. When the limit $t \to t'$ is considered, then $T(\mathbf{k},t,t)$ is
complex and, as will be seen later, describes the effect of unresolved on resolved,
or explicit, modes of motion, which results in a modification of both $\nu$ and $\beta$ in
(18). Generalizing Kraichnan’s (1976) definition for a spectrally anisotropic β-plane
turbulence, one can introduce two two-parametric characteristics, $\nu(\mathbf{k}|k_c)$
and $\beta(\mathbf{k}|k_c)$:
$$\nu(\mathbf{k}|k_c) = -\frac{\Re[T(\mathbf{k}|k_c)]}{k^2\,U(\mathbf{k})},$$ (19)

$$\beta(\mathbf{k}|k_c) = -\frac{\Im[T(\mathbf{k}|k_c)]\,k^2}{k_x\,U(\mathbf{k})},$$ (20)

where


rmc)  =  f  I  e-k,p.,(p^  -  ,^)sinQ  f^Vf  t7(p)£/(q) 

J  ja  p  q 

•2  2  l2  2 

- jf.Tg2  +  ~  fc2p^  ^(P)^(^^)  +  3  similar  terms  dpdq.  (21) 

Here, $\Theta_{-\mathbf{k},\mathbf{p},\mathbf{q}}$ is a complex triad interaction characteristic; its real part is the familiar triad relaxation time while the imaginary part describes the SGS effect on phase properties of the resolved modes. Also, $\alpha$ is the angle between the vectors $\mathbf{p}$ and $\mathbf{q}$, and $\iint_\Delta$ denotes integration over all triangles $(\mathbf{k},\mathbf{p},\mathbf{q})$ such that $p$ and/or $q$ are greater than $k_c$. Not shown are the terms that correspond to the mirror image of the triangle with respect to $\mathbf{k}$. The two-parametric viscosity $\nu(\mathbf{k}|k_c)$ in (19) is a measure of the energy transfer from the unresolved flow scales (turbulence and Rossby waves) to the resolved wave number $\mathbf{k}$. Similarly, the two-parametric $\beta(\mathbf{k}|k_c)$ in (20) accounts for the total effect of the SGS processes on the resolved Rossby wave with the natural frequency $\beta k_x/k^2$ and dispersion relation (2). Such a Rossby wave frequency shift has been discussed by Holloway (1986) and calculated by Kaneda and Holloway (1992; this volume) using the Lagrangian renormalized approximation under the assumption of small $\beta$.
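As a minimal sketch of how definitions (19) and (20) could be evaluated, the following Python function extracts the two eddy parameters from a complex transfer value. It assumes the reading in which both parameters carry a minus sign, so that a transfer of the form $T = -(\nu k^2 + i\beta k_x/k^2)\,U$ returns exactly $\nu$ and $\beta$; the function and argument names are illustrative and do not come from the paper.

```python
def eddy_parameters(T, U, kx, ky):
    """Return (eddy viscosity, eddy beta) for one resolved wave vector.

    T  : complex two-time transfer T(k|kc) taken at equal times (assumed given)
    U  : real vorticity correlation U(k) at the same wave vector
    kx : zonal wave-number component (must be nonzero for the eddy beta)
    ky : meridional wave-number component
    """
    k2 = kx * kx + ky * ky
    # Real part of T modifies the damping term nu * k^2 in Eq. (18).
    nu_eddy = -T.real / (k2 * U)
    # Imaginary part of T modifies the wave term beta * kx / k^2 in Eq. (18).
    beta_eddy = -T.imag * k2 / (kx * U)
    return nu_eddy, beta_eddy
```

With this convention a negative real transfer gives a positive eddy viscosity (drain of the resolved mode), and a positive real transfer gives a negative one.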

The two-parameter quantities $\nu(\mathbf{k}|k_c)$ and $\beta(\mathbf{k}|k_c)$ thus defined generalize the notion of eddy viscosity for flows that involve both turbulence and waves. Here we suggest that in LES of $\beta$-plane turbulence, $\nu(\mathbf{k}|k_c)$ and $\beta(\mathbf{k}|k_c)$ should be used as eddy viscosity and eddy $\beta$, respectively.

Another useful interpretation of the response and eddy parameters is in associating the former with the effective dispersion relation (similar to the bare dispersion relation (2)) for SGS modes and the latter with the effective dispersion relation for resolved modes.

Different spectral closures provide different expressions for $\Theta_{-\mathbf{k},\mathbf{p},\mathbf{q}}$ but they all involve eddy damping characteristics that should be found in conjunction with (3). This leads to the necessity either to introduce phenomenological considerations to parameterize the eddy damping or to solve a coupled field problem, which is a difficult task, particularly when spectral anisotropy and waves are present. Kraichnan (1976) has solved this problem numerically using the TFM for isotropic 2D turbulence, while Holloway and Hendershott (1977) applied TFM to $\beta$-plane turbulence assuming weak anisotropy.

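The function $\Theta_{-\mathbf{k},\mathbf{p},\mathbf{q}}$ can be calculated using the RG-derived response parameter $\nu(\mathbf{k})$ or renormalized Green function $G_r(\mathbf{k},\omega)$; it can be shown that in the lowest nontrivial order of the $\epsilon$-expansion

$$\Theta_{-\mathbf{k},\mathbf{p},\mathbf{q}} = \frac{1}{2}\left[G_r^{-1}(-\mathbf{k},0) + G_r^{-1}(\mathbf{p},0) + G_r^{-1}(\mathbf{q},0)\right]^{-1} = \frac{1}{2}\,\frac{(\nu_k + \nu_p + \nu_q) - i(\omega_{-k} + \omega_p + \omega_q)}{(\nu_k + \nu_p + \nu_q)^2 + (\omega_{-k} + \omega_p + \omega_q)^2},$$ (22)

where $\nu_k = \nu(\mathbf{k})k^2$, and $\omega_k = \beta k_x/k^2$.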
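The structure of the triad characteristic can be illustrated with a short Python sketch. It assumes that each inverse Green function at zero frequency has the form $\nu_k + i\omega_k$, with $\nu_k = \nu(\mathbf{k})k^2$ and $\omega_k = \beta k_x/k^2$, and it treats the overall constant as an explicit `prefactor` argument whose default of one half is an assumption about the normalization. All names are illustrative.

```python
def rossby_frequency(beta, kx, ky):
    """omega_k = beta * kx / k^2, in the sign convention of this section."""
    return beta * kx / (kx * kx + ky * ky)


def triad_relaxation(nu_k, nu_p, nu_q, w_mk, w_p, w_q, prefactor=0.5):
    """Complex triad characteristic for the triad (-k, p, q).

    nu_*  : damping rates nu(k) * k^2 of the three modes
    w_mk  : frequency of the mode -k; w_p, w_q: frequencies of p and q
    prefactor : overall constant (assumed normalization)
    """
    damping = nu_k + nu_p + nu_q      # sum of eddy damping rates
    detuning = w_mk + w_p + w_q       # frequency mismatch of the triad
    # Inverse of the summed inverse Green functions at zero frequency.
    return prefactor / complex(damping, detuning)
```

For an exactly resonant triad (zero detuning) the result is real and equals the prefactor times the usual relaxation time $1/(\nu_k + \nu_p + \nu_q)$; a nonzero detuning shortens the real part and produces a nonzero imaginary part.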

The idea of using the RG-derived response parameters in second order spectral closures was first suggested by Dannevik et al. (1987); in particular, they showed that the Eddy-Damped, Quasi-Normal, Markovian (EDQNM) approximation (Orszag, 1977) is obtained from the RG theory in the lowest nontrivial order of the $\epsilon$-expansion.

The  use  of  RG-derived  response  parameters  in  second  order  spectral  closures 
appears  to  be  a  powerful  idea  because  renormalized  Green  functions  in  RG  are 
decoupled  from  correlation  functions  and  thus  can  be  calculated  independently. 

From the point of view of practical applications, the RG-based spectral closures, and thus eddy parameters, can be calculated in a self-consistent way free of phenomenological approximations with full account for spectral anisotropy and turbulence-wave interactions. The models implementing such eddy parameters should have the correct SGS physics; they are expected to perform rather well in a variety of complicated flows typical of physical oceanography, in both eddy-resolving and non-eddy-resolving configurations.

In the limit $(\nu_k + \nu_p + \nu_q)/(\omega_{-k} + \omega_p + \omega_q) \to 0$, $\Theta_{-\mathbf{k},\mathbf{p},\mathbf{q}}$ reduces to a $\delta$-function, $\Theta_{-\mathbf{k},\mathbf{p},\mathbf{q}} \to (\pi/2)\,\delta(\omega_{-k} + \omega_p + \omega_q)$, thus revealing the resonance condition for wave triads broadened by turbulence. In this limit, one recovers the approximation of weakly non-linear waves (Holloway, 1986; Carnevale and Martin, 1982; Salmon, 1982; Reznik, 1986) that leads to the kinetic, or Boltzmann, equation for waves.
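The step behind this limit can be sketched as follows. The shorthand $N$, $\Omega$ and the constant $c$ are introduced here only for illustration: $N = \nu_k + \nu_p + \nu_q$ is the total damping, $\Omega = \omega_{-k} + \omega_p + \omega_q$ is the frequency mismatch, and $c$ is the overall normalization of the triad characteristic (one half under the reading assumed here).

$$\Theta = \frac{c}{N + i\Omega} = c\,\frac{N}{N^2 + \Omega^2} - i\,c\,\frac{\Omega}{N^2 + \Omega^2}, \qquad \int_{-\infty}^{\infty} \frac{N}{N^2 + \Omega^2}\,d\Omega = \pi \quad (N > 0).$$

The real part is a Lorentzian in $\Omega$ of width $N$ and fixed area $c\pi$, so it tends to $c\pi\,\delta(\Omega)$ as $N \to 0$. For finite $N$ the resonance $\Omega = 0$ is satisfied only to within a band of width $N$, which is the sense in which turbulence broadens the wave resonance.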

In Fig. 5, we plot $\nu(k|k_c)$ for isotropic 2D turbulence ($\beta = 0$) calculated using the RG-based response function $\nu(k)$ in (10). As in Kraichnan (1976), $\nu(k|k_c)$ has a positive cusp near $k_c$ and becomes negative as $k \to 0$. Numerical values of the RG-based $\nu(k|k_c)$ are close to those derived by Kraichnan (1976).

Figure 5. Normalized two-parametric viscosity $\nu(k|k_c)/|\nu(0|k_c)|$ for isotropic ($\beta = 0$) 2D turbulence.


In Fig. 6a, we plot the angle-dependent, RG-based $\nu(\mathbf{k}|k_c)$ for isotropic 2D turbulence ($\beta = 0$). Obviously, the result plotted here is the body of revolution formed by the curve shown in Fig. 5. The isotropy of Fig. 6a is broken in Fig. 6b, where $\nu(\mathbf{k}|k_c)$ for $\beta \neq 0$ is shown. For $k/k_c > 0.6$, or $k/k_\beta > 1.0$, the $\beta$-effect is not pronounced and $\nu(\mathbf{k}|k_c)$ behaves quite similarly to isotropic 2D turbulence; there is a positive cusp and then $\nu(\mathbf{k}|k_c)$ becomes negative. As $k \to 0$, the effect of the $\beta$-term becomes stronger; $\nu(\mathbf{k}|k_c)$ remains negative in the vicinity of $\phi = \pm\pi/2$ but increases in other directions. The negativity of $\nu(\mathbf{k}|k_c)$ along $\phi = \pm\pi/2$ indicates a strong inverse energy transfer to these directions which, in physical space, corresponds to energy funneling into zonal flows $\mathbf{v} = (v_x(y), 0)$. This demonstrates that zonal flows, typical of both Earth and planetary circulations (Ingersoll, 1990), result from and are sustained by the self-organization of the quasi-2D turbulence on the $\beta$-plane.

To single out the mechanism causing $\nu(\mathbf{k}|k_c)$ to remain negative along $\phi = \pm\pi/2$ for small $k$, or the mechanism of zonalization in the physical space, $\nu(\mathbf{k}|k_c)$ was calculated with the RG-based vorticity correlation function (16) for isotropic turbulence, in which the non-zero $\beta$-term was retained only in the triad relaxation




Figure 6a. Two-parametric viscosity $\nu(\mathbf{k}|k_c)$ normalized by its maximum value for isotropic ($\beta = 0$) 2D turbulence.


characteristic $\Theta_{-\mathbf{k},\mathbf{p},\mathbf{q}}$ in (21). In Fig. 6c, we plot $\nu(\mathbf{k}|k_c)$ in this case and observe the same general features as the two-parametric viscosity calculated with the full model (cf. Fig. 6b). Particularly strong negative values along $\phi = \pm\pi/2$ are also present in Fig. 6c. This result indicates that the energy funneling into zonal flows is more the result of the $\beta$-effect on $\Theta_{-\mathbf{k},\mathbf{p},\mathbf{q}}$ than on the correlation function $U(\mathbf{k})$.

In Fig. 7, we plot $\beta(\mathbf{k}|k_c)/\beta$. Similarly to the two-parametric viscosity, this characteristic also reveals a positive cusp for $k/k_c \approx 1$, but unlike $\nu(\mathbf{k}|k_c)$, $\beta(\mathbf{k}|k_c)$ varies significantly at large $k$; it is larger inside the sectors $(\pi/4, 3\pi/4)$ and $(5\pi/4, 7\pi/4)$ than outside. Since $\beta(\mathbf{k}|k_c)$ is in fact a correction to $\beta$, the result plotted in Fig. 7 indicates that at $k/k_c \approx 1$ the SGS contribution to $\beta$ is comparable to $\beta$ itself. In the limit $k \to 0$, the SGS contribution to $\beta$ decreases; the eddy $\beta$ remains small for all directions.

6.  ANISOTROPIC  ENERGY  TRANSFER 

The spectral energy transfer, $-T_E(\mathbf{k}|k_c) = 2k^2\nu(\mathbf{k}|k_c)E(\mathbf{k})$, is plotted in Fig. 8. By definition, $T_E(\mathbf{k}|k_c)$ accounts for the total energy transfer, that is, the transfer due to turbulence and non-linear waves. As $k \to 0$, $-T_E(\mathbf{k}|k_c)$ approaches 0 for all directions except in the vicinity of $\phi = \pm\pi/2$ and $\phi = 0, \pi$, where it develops four
Figure 6b. Two-parametric viscosity $\nu(\mathbf{k}|k_c)$ normalized by its maximum value for $\beta$-plane turbulence. Here $k_c = 1.73k_\beta$.


negative dips. The dips along $\phi = \pm\pi/2$ have been identified earlier with the flow zonalization. The interpretation of the other two dips is more subtle. As was shown in Fig. 2b, the region in the vicinity of $\phi = 0, \pi$ is strongly dominated by Rossby waves. It appears now that the energy of these zonally propagating waves is sustained by the anisotropic transfer. As $k \to 0$, the inverse energy cascade becomes less efficient because wave triads should satisfy an additional resonance condition. However, since total energy must be conserved, the increasing amount of energy funnels into $\phi = 0, \pi$ and $\phi = \pm\pi/2$. Such a large-scale flow organized into zonal jets and zonally propagating Rossby waves agrees with results of numerical simulations (Rhines, 1975; Williams, 1978; Yoden and Yamada, 1993). The energy flux into $\phi = 0, \pi$ is manifested in the annihilation of large-scale eddies due to radiation of their energy by Rossby waves. The tendency of the $\beta$-effect to destroy coherent vortex structures has been demonstrated in numerical simulations (Maltrud and Vallis, 1991). The Rossby wave radiation does not occur only for structures with $k_x \to 0$, i.e., zonal jets, which thus become an additional attracting large-scale flow configuration.
Figure 6c. Two-parametric viscosity $\nu(\mathbf{k}|k_c)$ normalized by its maximum value. Here $\beta \neq 0$ but $U(\mathbf{k})$ is isotropic. To better resolve the region of small $k$, $k_c$ is reduced to $k_c = 0.66k_\beta$.

One may apply these results to tackle the problem of the Gulf Stream separation and maintenance from the point of view of non-linear dynamics. An energetic jet stream is formed due to the western intensification and flows northward along the east coast of the US. Topographic deflection and, say, adverse pressure gradient (Haidvogel et al., 1992) at Cape Hatteras facilitate the funneling of the jet’s energy in the zonal direction such that it leaves the coast. The energy funneling mechanism sustains this organized jet flow in the open ocean and is in fact responsible for the Gulf Stream’s existence. Such a barotropic picture of the Gulf Stream separation is, of course, an oversimplification because baroclinicity plays an important role in the stream’s dynamics. However, the present results strongly indicate that Gulf Stream separation and maintenance may be facilitated by essentially non-linear processes, a direction almost unexplored in current Gulf Stream research (see the survey by Haidvogel et al., 1992).
Figure 7. Two-parametric $\beta(\mathbf{k}|k_c)/\beta$ for $\beta$-plane turbulence. Here $k_c = 1.73k_\beta$.

7.  APPLICATION  OF  THE  RG  THEORY 
FOR  PRACTICAL  OCEANOGRAPHIC  SIMULATIONS 

The present results can be used for large eddy simulation of $\beta$-plane turbulence. For this purpose, $k_c$ should be identified with the boundary between explicit and subgrid scales, and $\nu(\mathbf{k}|k_c)$ and $\beta(\mathbf{k}|k_c)$ should be used as spectral eddy viscosity and eddy $\beta$, respectively. These and similar eddy parameters can also be adopted as SGS characteristics in models of large-scale ocean circulation. There are two obvious difficulties, however, that may hamper the application of these eddy parameters to practical oceanographic problems:

•  All  the  derivations  here  are  performed  in  Fourier  space  while  the  majority  of 
OGCMs  are  developed  in  physical  space; 

•  The  presence  of  negative  viscosity  may  lead  to  inherent  numerical  instability  of 
OGCMs. 

Here, it will be shown how both problems can be resolved for the example of isotropic 2D turbulence. In that case, the two-parametric viscosity, shown in Fig. 5, can be interpolated by a polynomial expression in powers of $(k/k_c)^2$:
Figure 8. Spectral energy transfer for $\beta$-plane turbulence, $-T_E(\mathbf{k}|k_c)$, normalized by its maximum value. Here $k_c = 1.73k_\beta$.

$$\nu(k|k_c) = \nu(0|k_c) + 0.5\,|\nu(0|k_c)|\left(\frac{k}{k_c}\right)^2 + 0.125\,|\nu(0|k_c)|\left(\frac{k}{k_c}\right)^4 + \ldots,$$ (23)

where 

l/(0|fcc)  =  ( 

1  - 1)  K^c), 

(24) 

which  gives 

l/(0|fcc)  =  ■ 

<  0 

(25) 

for  the  energy  sub-range  and 

= 

-u{kc)  <  0 

(26) 

for the enstrophy sub-range. These results are consistent with Kraichnan’s (1976) assertion that at low $k$, $\nu(k|k_c)$ reaches saturation at negative values. Equations (24-26) not only reconfirm Kraichnan’s (1976) result, but quantify it in terms of $\nu(k_c)$.

Approximated by the first two terms, the inverse Fourier transform of (23) produces a dissipation term in (1) of the form

$$-\nu_1(\Delta)\nabla^2\zeta - \nu_2(\Delta)\nabla^4\zeta,$$ (27)

where $\Delta$ is the grid resolution in physical space. Equation (27) is a linear superposition of a negative Laplacian and a positive biharmonic viscosity. The negative Laplacian viscosity is the destabilizing term that may be responsible for initiating and maintaining the eddy activity, while the biharmonic (and higher order) friction term provides an efficient dissipation mechanism that also ensures numerical stability. Equations of similar structure have appeared in different branches of physics (Kuramoto and Tsuzuki, 1976; Kuramoto, 1984; Sivashinsky, 1979, and references therein) and are known as Kuramoto-Sivashinsky-type equations. A very important feature of these equations is that they have regular solutions due to the stabilizing biharmonic term and thus produce well-posed problems despite the presence of the negative Laplacian.
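The interplay of the two terms can be made concrete with a small Python sketch of the linear growth rate of a single Fourier mode. It assumes the dissipation term has the form $-\nu_1\nabla^2\zeta - \nu_2\nabla^4\zeta$ with constant coefficients $\nu_1, \nu_2 > 0$ and ignores every other term of the vorticity equation; the function names are illustrative only.

```python
import math


def growth_rate(k, nu1, nu2):
    """Growth rate of a mode exp(i k x) under -nu1*Lap(zeta) - nu2*Lap^2(zeta).

    In Fourier space Lap -> -k^2 and Lap^2 -> k^4, so the negative Laplacian
    contributes +nu1*k^2 (growth) and the biharmonic term -nu2*k^4 (damping).
    """
    return nu1 * k**2 - nu2 * k**4


def neutral_wavenumber(nu1, nu2):
    """Wave number above which every mode is damped."""
    return math.sqrt(nu1 / nu2)


def fastest_growing_wavenumber(nu1, nu2):
    """Wave number at which the growth rate is largest."""
    return math.sqrt(nu1 / (2.0 * nu2))
```

Modes with $k$ below `neutral_wavenumber` are amplified and all shorter scales are damped, with the damping growing like $k^4$. This bounded band of unstable wave numbers is the linear counterpart of the regularity property mentioned above.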

Models of horizontal mixing used in existing OGCMs mostly employ either positive Laplacian or negative biharmonic operators with constant or Smagorinsky-type (Smagorinsky, 1963, 1993) eddy viscosities. The closest model to that of (23) with a scale-selective representation of the SGS processes used in today’s geophysical simulations is that given by the anticipated potential vorticity (APV) method (Sadourny and Basdevant, 1985), in which potential enstrophy dissipation is parameterized by a diffusion operator in the form of an iterated Laplacian, $k^{-2n}(-\nabla^2)^n$, $\tau$ being a characteristic eddy turnover time at the grid scale, and $n \sim 8$. A detailed analysis of the APV method is given by Vallis and Hua (1988), who pointed out that its major advantage is that it conserves energy and dissipates enstrophy. By virtue of energy conservation, the APV method produces effective eddy viscosities that are cusp-like near $k_c$ and negative at small $k$. However, its implementation involves a certain degree of phenomenology, for instance, with respect to determination of $\tau$. Another disadvantage of the APV method is its lack of Galilean invariance.

In practical situations where spectral anisotropy and/or waves may be present, the RG-based parameterization of SGS processes will not be as simple as that given by (23), (24), (27). As an example, in the case of $\beta$-plane turbulence, the two-parametric viscosity shown in Fig. 6b is a complicated cylindrical surface with pronounced anisotropy. The corresponding viscosity operators in physical space will also be complicated and will acquire tensorial properties. However, these operators can be calculated from the RG theory in a self-consistent way with no appeal to phenomenological or empirical considerations; they incorporate correct physics that includes Galilean invariance, conservation of energy, dissipation of enstrophy, Rossby wave-turbulence interaction and the negative viscosity phenomena that have long been known to play an important role in geophysical and planetary circulations (Starr, 1968).


8.  CONCLUSIONS  AND  DISCUSSION 

We have developed a self-consistent theory of $\beta$-plane turbulence using the RG theory. This theory is rooted in the basic physics of 2D turbulence and Rossby waves and recovers most of the known theoretical and numerical results on $\beta$-plane turbulence.

At large wave numbers, the flow is turbulence dominated and reveals isotropic 2D turbulence-like behavior. With decreasing $k$, the $\beta$-effect progressively becomes more pronounced. The transition from 2D turbulence to the Rossby wave dominated regime takes place near the dumb-bell curve (15) that is the anisotropic generalization of the transitional wave number introduced by Rhines (1975). Inside the dumb-bell curve, the flow dynamics approach the limit of weakly interacting non-linear Rossby waves described by the kinetic, or Boltzmann, equation.

With decreasing $k$, the energy spectrum undergoes a smooth transition from the isotropic Kolmogorov law to a strongly anisotropic law along $\phi = 0, \pi$.

The energy transfer has a cusp-like behavior near the wave number $k_c$ [defined above (9)] and becomes negative for smaller $k$, revealing inverse energy transfer typical of 2D turbulence. At yet smaller $k$, the inverse transfer diminishes for all directions but $\phi = 0, \pi$ and $\phi = \pm\pi/2$, where it remains negative for all $k$, indicating that the flow undergoes self-organization into zonal flows and zonally propagating Rossby waves.

The eddy viscosity and eddy $\beta$ parameters for LES of $\beta$-plane turbulence have been obtained using the two-time von Karman-Howarth-type equation for the vorticity correlation function and fully account for the turbulence-Rossby wave interaction and spectral anisotropy. When converted back to physical space, the spectral SGS representation produces a Kuramoto-Sivashinsky-type equation, with a negative Laplacian term, a positive biharmonic friction, and, possibly, higher order dissipative terms. The negative Laplacian is a destabilizing term which is directly responsible for initiating and sustaining the eddy activity in the world ocean, atmosphere and other quasi-2D, high Reynolds number systems. The higher order biharmonic and hyperviscosities provide efficient scale-selective dissipation mechanisms that ensure the well-posedness and numerical stability of the problem. These results have direct implications for numerical modeling of the oceanic circulation. The action of negative viscosity is not only a persistent source of the SGS energy, but also a vehicle for transporting the SGS energy to ever larger scales. Compounded with spectral anisotropy and tendencies to self-organization, the negative viscosity phenomena may play a very important role in the ocean circulation physics. It is clear therefore that SGS representation is an important element in both eddy-resolving and non-eddy-resolving models. The RG theory of turbulence provides a self-consistent framework capable of addressing some of the key issues of SGS parameterization for oceanographic modeling; the RG-based SGS models are Galilean invariant, conserve energy, dissipate enstrophy, include Rossby wave-turbulence interaction and incorporate the negative viscosity phenomena.

The RG-based SGS parameterization of $\beta$-plane turbulence indicates that the moving dissipation cutoff $k_c$ can be chosen quite close to $k_\beta$, the scales at which the Rossby wave dynamics become dominant. This fact gives rise to the hope that if the RG-based SGS parameterization is developed for processes at scales of the deformation radius, the grid resolution could be of the order of the deformation radius itself. This would lead to creation of a non-eddy-resolving model in which eddy effects are properly incorporated.

The present study is concerned with only a “building block” geophysical flow, viz. $\beta$-plane turbulence. Real oceanic flows are much more complicated. However, the general approach developed in this paper may be generalized for more complicated situations. In particular, the effects of topography and stratification can be incorporated in the vorticity equation (Charney and Stern, 1962; Rhines, 1979; McWilliams, 1989) which enables the extension of the RG analysis to these flows. For example, barotropic quasi-2D turbulence over topography can in some cases be described by the barotropic vorticity equation with a $\beta$-like term due to the topographic gradient. As for $\beta$-plane turbulence, the topographic $\beta$-term generates mean flows and topographic Rossby waves in the direction normal to the topographic gradient, with direct implications for coastal oceanography. These conclusions are supported by the results of direct numerical simulations (Vallis and Maltrud, 1993).
ABSTRACT

Frequency  shifts  in  Rossby  wave  propagation  due  to  nonlinear  interactions  in 
geostrophic  (beta-plane)  turbulence  are  studied  by  direct  numerical  simulations 
and  a  statistical  closure  theory.  The  shifts  are  of  systematic  sign  and  can  be  quite 
large  as  compared  with  the  linear  Rossby  frequency.  Under  certain  conditions,  the 
closure  equations  yield  a  simple  approximation  for  the  shifts.  An  explanation  of  the 
shifts  is  given  by  a  model  that  includes  oscillating  random  sweeping  and  strain  of 
large  eddies. 


INTRODUCTION 


The beta-plane model is one of the simplest turbulence models that takes into account the planetary gradient of the Coriolis effect, obeying


dc  a(c,v>) 

dt  d{x,y) 


(1) 


where $\psi$ is the stream function related to the fluid velocity as $\mathbf{u} = (\partial\psi/\partial y, -\partial\psi/\partial x)$, $\zeta$ is the vorticity, and $\nu$ the viscosity. The $\beta$ term represents the Coriolis effect. In the absence of the nonlinear Jacobian term, Eq.(1) exhibits just the Rossby wave propagation, while in the absence of the $\beta$ term, Eq.(1) is the two-dimensional Navier-Stokes equation. Thus the model provides a simple prototype of a wave/turbulence system.

We consider in this paper the frequency shifts of the Eulerian two-time correlation function in homogeneous and quasi-stationary beta-plane turbulence. Under periodic boundary conditions, in the Fourier space given by

$$\mathbf{u}(\mathbf{x},t) = \sum_{\mathbf{k}} \mathbf{u}(\mathbf{k},t)\exp(i\mathbf{k}\cdot\mathbf{x}),$$

the Eulerian correlation spectrum $U$ defined by

$$U(\mathbf{k},\tau,t) = \langle \mathbf{u}(\mathbf{k},\tau)\cdot\mathbf{u}(-\mathbf{k},t)\rangle,$$ (2)


obeys 


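$$\left[\frac{\partial}{\partial\tau} + \nu k^2 + i\omega_0(\mathbf{k})\right]U(\mathbf{k},\tau,t) = T(\mathbf{k},\tau,t) = -\langle J(\mathbf{k},\tau)\,\psi(-\mathbf{k},t)\rangle,$$ (3)

where $\omega_0(\mathbf{k}) = -\beta k_x/k^2$ is the linear frequency, and $J$ is the Fourier transform of the Jacobian term in Eq.(1).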
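For orientation, the linear frequency can be evaluated directly. The sketch below assumes a plane wave of the form $\exp(i\mathbf{k}\cdot\mathbf{x} - i\omega_0 t)$, so that the zonal phase speed is $\omega_0/k_x = -\beta/k^2$; the function names are illustrative.

```python
def linear_rossby_frequency(beta, kx, ky):
    """omega_0(k) = -beta * kx / k^2 for a wave vector (kx, ky) != (0, 0)."""
    return -beta * kx / (kx * kx + ky * ky)


def zonal_phase_speed(beta, kx, ky):
    """Phase speed in x, omega_0 / kx = -beta / k^2 (westward for beta > 0)."""
    return -beta / (kx * kx + ky * ky)
```

The phase speed is negative for every wave vector when $\beta > 0$, and purely zonal modes with $k_x = 0$ have zero frequency.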

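The real part of $T$ represents the energy transfer due to nonlinear interactions;

$$\left[\frac{\partial}{\partial t} + 2\nu k^2\right]U(\mathbf{k},t) = 2\,\mathrm{Re}\,T(\mathbf{k},t),$$

where $U(\mathbf{k},t) = U(\mathbf{k},t,t)$ and $T(\mathbf{k},t) = T(\mathbf{k},t,t)$, while the imaginary part normalized by the energy spectrum $U$ gives

$$\frac{\mathrm{Im}\,T(\mathbf{k},t)}{U(\mathbf{k},t)} = \Delta\omega(\mathbf{k}) = \mathrm{Re}[\omega(\mathbf{k})] + \omega_0(\mathbf{k}),$$ (4a)

where

$$\omega(\mathbf{k}) = \int \omega\,U(\mathbf{k},\omega)\,d\omega \Big/ \int U(\mathbf{k},\omega)\,d\omega, \qquad U(\mathbf{k},t+\tau,t) = \int U(\mathbf{k},\omega)\exp(i\omega\tau)\,d\omega.$$ (4b)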


Here and hereafter, we assume quasi-stationarity of turbulence such that the $t$-dependence of $U(\mathbf{k},t+\tau,t)$ is negligible compared with its $\tau$-dependence, and omit the argument $t$ at will. In the absence of the nonlinear interactions, $\Delta\omega$ is zero, and $\Delta\omega$ is therefore a measure of the frequency shifts due to nonlinear interactions.
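A minimal Python sketch of how the shift could be estimated from a sampled frequency spectrum is given below. It assumes a real, non-negative spectrum $U(\mathbf{k},\omega)$ tabulated on a uniform frequency grid for one wave vector, so that the grid spacing cancels in the ratio of the two integrals; the function names and the Gaussian test spectrum in the usage note are illustrative assumptions, not data from the paper.

```python
import numpy as np


def mean_frequency(omega, spectrum):
    """Spectrum-weighted mean frequency on a uniform grid.

    Both integrals are replaced by Riemann sums; the common grid spacing
    cancels in the ratio.
    """
    omega = np.asarray(omega, dtype=float)
    spectrum = np.asarray(spectrum, dtype=float)
    return float(np.sum(omega * spectrum) / np.sum(spectrum))


def frequency_shift(omega, spectrum, omega0):
    """Delta omega = (mean frequency) + omega0, following the form of (4a)."""
    return mean_frequency(omega, spectrum) + omega0
```

For a linear wave the correlation varies as $\exp(-i\omega_0\tau)$, so the spectrum is concentrated at $\omega = -\omega_0$ and the function returns a shift close to zero. A spectrum whose peak is displaced from $-\omega_0$ by some amount returns that displacement.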

Direct numerical simulations (DNS) of beta-plane turbulence in planar and spherical geometries so far have suggested that the shifts are westward and nearly proportional to $k_x$ (see the review by Holloway, 1986). There have been theoretical studies on renormalized frequencies that take into account the nonlinear interactions (Legras, 1980; Carnevale and Martin, 1982). However, the reason for the observed shifts remained unknown.

The primary purpose of this paper is to study the frequency shifts by DNS and a two-point closure theory, and to understand their cause. As for the closure theory, we use the Lagrangian renormalized approximation (LRA; Kaneda, 1981), which is derived by systematic Lagrangian renormalized expansions.
STATISTICAL APPROXIMATION (LRA)


In  a  wide  class  of  two-point  closure  theories  of  turbulence,  the  transfer  function 
T  for  homogeneous  turbulence  in  a  quasi-stationary  state  is  given  by  an  equation 
of  the  form 

Pi<l 

xKp"  -  iyv(p)V(^)  -  2(p-‘  -  -  }")j;(k)Cf(q)|,  (5) 

where $\sum_{\mathbf{p},\mathbf{q}}$ denotes the sum over $\mathbf{p}$, $\mathbf{q}$ satisfying $\mathbf{p}+\mathbf{q} = \mathbf{k}$. The principal difference between various closure theories comes from the difference of the triple relaxation factors $\theta$.

The application of the LRA to the $\beta$-plane model equation (1) yields Eq.(5) with

$$\theta(-\mathbf{k},\mathbf{p},\mathbf{q}) = \int_0^\infty G(-\mathbf{k},\tau)\,G(\mathbf{p},\tau)\,G(\mathbf{q},\tau)\,d\tau,$$ (6)

where  G  is  an  appropriately  defined  Lagrangian  response  function  obeying 

+  iu.o(k)lG(k,r)  =  -2f;Jt^  rG(-q,s)*G(-q)G(k,r),  (7) 

^  p,q  P  9  Jo 

$G(\mathbf{k},0) = 1$,

(Kaneda,  1981;  Kaneda  and  Gotoh,  1988). 

In  terms  of  <f>  defined  by 


Eq.(7)  may  be  written  as 

_?L 

dr^ 

where 


G(k,  r)  =  exp[-^(k,  r)], 


^(^(k, r)  =  2  exp{-<^(-q, r)]U(q), 


p.q 


k'^p^q^ 


(8) 


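$\phi(\mathbf{k},0) = 0$, $\phi_\tau(\mathbf{k},0) = \nu k^2 + i\omega_0(\mathbf{k})$,

and we have used $U(\mathbf{q}) = U(-\mathbf{q})$. For small $\tau$, $\phi$ may therefore be expanded as

$$\phi(\mathbf{k},\tau) = [\nu k^2\tau + A(\mathbf{k})\tau^2 + \ldots] + i[\omega_0(\mathbf{k})\tau + B(\mathbf{k})\tau^3 + \ldots],$$ (9a)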
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where 

A(k)  =  253|kxq|*-^C^(q),  (94) 

p.q  ^ 

B(k)  =  -2^f;|kxq|*^i;(q),  (9c) 

p.q  ^ 

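$\hat{\mathbf{k}} = \mathbf{k}/k$, and we have used $\mathbf{p}\times\mathbf{q} = \mathbf{k}\times\mathbf{q}$ for $\mathbf{k} = \mathbf{p}+\mathbf{q}$.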

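In the following section, we need an estimation for the imaginary part of the triple relaxation factor $\theta(-\mathbf{k},\mathbf{p},\mathbf{q})$ with $k \gg q$. If $\phi(\mathbf{p}) \sim \phi(\mathbf{k})$ for $\mathbf{p} = \mathbf{k} - \mathbf{q}$ and $k \sim p \gg q$, then Eq.(6) gives

$$\theta(-\mathbf{k},\mathbf{p},\mathbf{q}) \sim \theta(-\mathbf{k},\mathbf{k},\mathbf{q}) = \int_0^\infty \exp[-2\phi_R(\mathbf{k},\tau) - \phi_R(\mathbf{q},\tau) - i\phi_I(\mathbf{q},\tau)]\,d\tau,$$ (10)

for $k \gg q$, where $\phi_R$ and $\phi_I$ are the real and imaginary parts of $\phi$, respectively, and we have used $\phi_R(\mathbf{k}) = \phi_R(-\mathbf{k})$, and the term $\phi_I(\mathbf{k})$ has disappeared because

$$\phi_I(\mathbf{k}) + \phi_I(-\mathbf{k}) = 0.$$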

In order to get a rough estimate of the imaginary part, we assume that the terms of higher order in $\tau$ may be neglected when Eq.(9a) is substituted into Eq.(10), i.e., we may substitute

$$\phi_R(\mathbf{k}) \sim \nu k^2\tau + A(\mathbf{k})\tau^2, \qquad \phi_I(\mathbf{k}) \sim \omega_0(\mathbf{k})\tau + B(\mathbf{k})\tau^3,$$ (11)

into Eq.(10). This substitution does not imply that the value of $\phi$ itself is assumed to be well approximated by Eq.(11) in the entire range of $\tau$. It is clear that Eq.(11) may be a poor approximation for large $\tau$, although it may be good for small $\tau$. The substitution implies that we assume the magnitude of the integrand in Eq.(10) to be sufficiently small for large $\tau$ (where Eq.(11) may be wrong), and the error caused by the substitution to be not serious.

For the sake of simplicity, we further assume that the anisotropy of the energy spectrum may be negligible, or we may discard the anisotropic part of $U$, for example by the smallness of $\beta$. Then $A(\mathbf{k})$ is a function of only the magnitude $k$, and Eq.(9c) may be reduced to

$$B(\mathbf{k}) = \omega_0(\mathbf{k})\,C(k),$$

where $C$ is a function of only $k$.

The substitution of Eq.(11) into Eq.(10) then gives

$$\theta(-\mathbf{k},\mathbf{p},\mathbf{q}) \simeq \int_0^\infty \exp\{-\nu[2k^2 + q^2]\tau - [2A(k) + A(q)]\tau^2 - i\omega_0(\mathbf{q})\tau[1 + C(q)\tau^2]\}\,d\tau,$$

where we have put $A(\mathbf{k}) = A(k)$. If the viscous term is negligible, and

$$2A(k) \gg A(q), \qquad 2A(k) \gg |C(q)|, \qquad [2A(k)]^{1/2} \gg |\omega_0(\mathbf{q})|, \qquad (12a,b,c)$$

this gives

$$\mathrm{Im}\,\theta(-\mathbf{k},\mathbf{p},\mathbf{q}) \simeq -\gamma(k)\,\omega_0(\mathbf{q}), \qquad (13a)$$

where

$$\gamma(k) = \int_0^\infty \tau \exp[-2A(k)\tau^2]\,d\tau. \qquad (13b)$$
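
The integral in Eq.(13b) is elementary: for $A > 0$ it equals $1/(4A)$. The following minimal sketch checks this numerically. The function names, the truncation point `tau_max` and the Simpson rule are illustrative choices, not part of the original computation.

```python
import math

def gamma_closed_form(A):
    """Closed form of int_0^inf tau*exp(-2*A*tau^2) dtau, valid for A > 0."""
    return 1.0 / (4.0 * A)

def gamma_numeric(A, n=20000):
    """Composite Simpson estimate of the same integral.

    The infinite upper limit is truncated at tau_max, chosen so that
    exp(-2*A*tau_max^2) = exp(-50), which is negligible. n must be even.
    """
    tau_max = math.sqrt(25.0 / A)
    h = tau_max / n

    def integrand(tau):
        return tau * math.exp(-2.0 * A * tau * tau)

    total = integrand(0.0) + integrand(tau_max)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return total * h / 3.0
```

With this closed form, any estimate of $A(k)$ translates directly into an estimate of $\gamma(k)$.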

Although it is not easy to estimate $A$ and $C$ under general conditions, simple
estimates are possible under certain idealized conditions and assumptions, as
follows. Let $E$ and $Z$ be the total energy and enstrophy defined by

q  q 

respectively, and let us assume that most of the energy and enstrophy come from wavenumbers
near $k_E$ and $k_Z$, respectively, so that $E$ and $Z$ may be approximated as

q<KE  <t<Kx 

where $K_E$ and $K_Z$ are of the same order as $k_E$ and $k_Z$, respectively. If $k \gg k_Z$,
and the dominant contributions to $A(k)$ and $B(k)$ in Eqs.(9b) and (9c) come from
the domain $q < K_Z$, then it is shown after some algebra that Eqs.(9b,c) and (14b)
give

$$A(k) \simeq \tfrac{3}{8}Z, \qquad C(k) \sim Z, \qquad (15a,b)$$

while the dimensional consideration based on Eqs.(9b,c) and (14a) gives

$$A(q) = O(k_E^2 E), \qquad C(q) = O(k_E^2 E),$$

for $q \sim k_E \ll k$, provided that the dominant contributions to $A(q)$ and $C(q)$ are
from the energy-containing range.

The conditions (12a) and (12b) are then well satisfied if

$$Z \gg k_E^2 E. \qquad (16)$$

The numerical factor 2 in front of $A(k)$ in Eq.(12) is insignificant in the estimation
of the order of magnitude of the terms, but may be significant numerically in real
DNS of limited resolution, where the strong inequalities in Eqs.(12) and (16) may
hold only in a weak sense, i.e., the strong inequality "$\gg$" is to be replaced by the
weaker "$>$".
SIMPLIFIED APPROXIMATION


Let $T^{<}(\mathbf{k}|K)$ be the contribution from the interactions among the modes $(\mathbf{k},\mathbf{p},\mathbf{q})$
in Eq.(5) with $p$ or $q < K$. The contribution can be estimated in the same way as
Kraichnan (1966). Since

(p"  -  «")I(p"  -  ~  -*"(q  ■  Vk)|*^£/(k)i, 

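for $\mathbf{k} = \mathbf{p} + \mathbf{q}$ and $k \simeq p \gg q$, we have for $k \gg K$,

$$T^{<}(\mathbf{k}|K) \simeq -\sum_{q<K} \theta(-\mathbf{k},\mathbf{p},\mathbf{q})\,\frac{|\mathbf{k}\times\mathbf{q}|^2}{k^2 q^2}\,U(\mathbf{q})\,(\mathbf{q}\cdot\nabla_{\mathbf{k}})[k^2 U(\mathbf{k})], \qquad (17)$$

where the sum is over $\mathbf{q}$ satisfying $\mathbf{k} = \mathbf{p} + \mathbf{q}$ and $q < K$.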

In order to get a simplified approximation for the frequency shift $\Delta\omega(\mathbf{k})$ for large
wavenumber $k$, we introduce here the following three assumptions.

(I): The dominant contributions to $\mathrm{Im}\,T(\mathbf{k})$ for $k \gg K_Z$ come from nonlocal interactions
with low wavenumbers, so that

$$\mathrm{Im}\,T(\mathbf{k}) \simeq \mathrm{Im}\,T^{<}(\mathbf{k}|K_Z).$$

(II): $\beta$ is not very large, so that the anisotropy of the energy spectrum $U$ is weak
and we may therefore neglect its anisotropic part, i.e., we may put

$$U(\mathbf{q}) \simeq U(q).$$

(III): The imaginary part of the triple relaxation factor for $k \gg K_Z$ may be approximated
by Eq.(13) in the estimation of $\mathrm{Im}\,T^{<}(\mathbf{k}|K_Z)$.

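Under the assumption (III), Eq.(17) yields

$$\mathrm{Im}\,T^{<}(\mathbf{k}|K_Z) \simeq \gamma(k) \sum_{q<K_Z} \omega_0(\mathbf{q})\,\frac{|\mathbf{k}\times\mathbf{q}|^2}{k^2 q^2}\,U(\mathbf{q})\,(\mathbf{q}\cdot\nabla_{\mathbf{k}})[k^2 U(\mathbf{k})], \qquad (18)$$

and under the isotropic assumption (II) this may be further reduced to

$$\mathrm{Im}\,T^{<}(\mathbf{k}|K_Z) \simeq -\frac{\beta\,\gamma(k)\,k_x}{8k} \sum_{q<K_Z} U(q)\,\frac{d[k^2 U(k)]}{dk}. \qquad (19)$$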


A rough estimate of $\gamma(k)$ may be obtained by substituting Eq.(15a) into Eq.(13b),
which results in

$$\gamma(k) \simeq \frac{2}{3Z}. \qquad (20)$$
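If we further assume Eq.(14a), then the assumption (I) and Eq.(19) give

$$\mathrm{Im}\,T(\mathbf{k}) \simeq -\frac{\beta k_x E}{12 Z k}\,\frac{d[k^2 U(k)]}{dk},$$

and therefore

$$\Delta\omega(\mathbf{k}) = \frac{\mathrm{Im}\,T(\mathbf{k})}{U(k)} \simeq -\frac{\beta k_x E}{12 Z k\,U(k)}\,\frac{d[k^2 U(k)]}{dk}. \qquad (21)$$

If $U(k) \propto k^{-m}$, then Eq.(21) yields for $k \gg k_Z$,

$$\Delta\omega(\mathbf{k}) \simeq \frac{(m-2)}{12}\,\frac{\beta E}{Z}\,k_x. \qquad (22)$$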

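The comparison of Eq.(22) with the linear frequency $\omega_0(\mathbf{k})$ yields

$$\left|\frac{\Delta\omega(\mathbf{k})}{\omega_0(\mathbf{k})}\right| \simeq \frac{(m-2)}{12}\,\frac{k^2}{k_Z^2},$$

while the comparison with the random sweeping frequency $u'k$ yields

$$\frac{\Delta\omega(\mathbf{k})}{u'k} \simeq \frac{(m-2)}{6}\,\frac{k_\beta^2}{k_Z^2}\,\frac{k_x}{k},$$

where $k_Z = \zeta'/u'$ is a representative wavenumber for a flow with an rms velocity
$u' = E^{1/2}$ and an rms vorticity $\zeta' = Z^{1/2}$, and $k_\beta = (\beta/2u')^{1/2}$ is a representative
wavenumber obtained by comparing a representative wave speed with $u'$ (Rhines,
1975).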
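
As a rough illustration, the sketch below evaluates the simplified shift in two ways and checks that they coincide. It assumes that Eq.(22) takes the form $\Delta\omega \simeq (m-2)\beta k_x/(12 k_Z^2)$, a reading inferred from the numerical factors in Eqs.(19)-(21) rather than a verbatim transcription, together with $u' = E^{1/2}$, $\zeta' = Z^{1/2}$ and $k_\beta = (\beta/2u')^{1/2}$. All function names are hypothetical.

```python
import math

def shift_from_kz(m, beta, kx, k_Z):
    """Assumed form of Eq.(22): (m - 2) * beta * kx / (12 * k_Z**2)."""
    return (m - 2) / 12.0 * beta * kx / k_Z ** 2

def shift_from_sweeping(m, beta, kx, k, E, Z):
    """Same shift rebuilt from its ratio to the sweeping frequency u'k."""
    u_rms = math.sqrt(E)                      # assumed u' = E^(1/2)
    zeta_rms = math.sqrt(Z)                   # assumed zeta' = Z^(1/2)
    k_Z = zeta_rms / u_rms
    k_beta = math.sqrt(beta / (2.0 * u_rms))
    ratio = (m - 2) / 6.0 * (k_beta / k_Z) ** 2 * kx / k
    return ratio * u_rms * k
```

In this form the shift depends on the flow only through $k_Z^2 = Z/E$, which is why it does not change when the amplitude of the flow is rescaled.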

Equations (21) and (22) suggest that the frequency shifts are independent of
the amplitude of the turbulent flow. However, it is to be remembered that Eq.(21)
is based on the assumption (III) or Eq.(13), and in the derivation of Eq.(13) we
have assumed Eq.(12) and that the viscosity is negligible. In the limit of weak
nonlinearity, Eq.(13) does not hold, but

$$\mathrm{Im}\,\theta(-\mathbf{k},\mathbf{p},\mathbf{q}) \simeq -\frac{\omega_0(\mathbf{q})}{(2\nu k^2 + \nu q^2)^2 + \omega_0^2(\mathbf{q})}.$$

This can be justified by noting that neglecting the right-hand side of Eq.(8) yields

$$\phi(\mathbf{k},\tau) = \nu k^2\tau + i\omega_0(\mathbf{k})\tau.$$

Retracing the derivation of Eq.(21) then gives for $k \gg k_Z$,

$$\Delta\omega(\mathbf{k}) = -\frac{\beta k_x E}{32(\nu k^2)^2\,k\,U(k)}\,\frac{d[k^2 U(k)]}{dk}$$

instead of Eq.(21), provided that the assumptions (I) and (II) are still valid in this
limit, and $\nu k^2 \gg |\omega_0(\mathbf{q})|$ for $k \gg K_Z > q$.

It is also to be noted here that if $m < 2$ then the integration of Eq.(18) or Eq.(19)
over $q$ converges at low $q$. This implies that the dominant contributions come
from local or high wavenumbers, and this is incompatible with the assumption (I).
Hence $m$ must satisfy $m > 2$ unless the form $U(k) \propto k^{-m}$ is assumed to be valid
only in a local sense.

The  approximation  (22)  has  the  advantage  of  simplicity,  as  compared  with 
Eq.(5),  but  is  based  on  several  assumptions.  It  is  therefore  interesting  to  compare 
the  simplified  approximation  (22)  with  the  estimate  obtained  from  Eq.(5)  without 
using  the  assumptions.  In  the  next  section,  we  try  such  a  comparison  as  well  as  the 
comparison  of  the  theory  with  DNS. 


DNS  AND  NUMERICAL  SOLUTION  OF  THE  LRA 


Fields satisfying Eq.(1) under periodic boundary conditions were generated by an
alias-free spectral method with wavenumber increment $\Delta k = 1$ in each of the $k_x$ and
$k_y$ directions, and retained wavevector domain $k < K_{max}$, where $K_{max}$ is about
85. The initial values of the Fourier components $u(\mathbf{k})$ were chosen to be normally
distributed with given initial isotropic spectrum $U(k, t = 0)$. In the runs reported
here, we used $\nu = 0.004$ and $E(k) = \pi k U(k, t = 0) = C k \exp(-2k/k_0)$, where $k_0$ is
a constant, and the constant $C$ is normalized so that $E = 1$ in each realization. In one
series of runs (Series B), $k_0$ was fixed at $k_0 = 5.0$ and $\beta$ was varied as $\beta = 2.5$, 5.0
and 10.0. These runs are called here B25, B5 and B10, respectively. In another
series (Series K), $\beta$ was fixed at $\beta = 5.0$ and $k_0$ was varied as $k_0 = 2.5$, 5.0 and
10.0. These runs are called K25, K5 and K10, respectively.
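
The normalization of the initial spectrum can be sketched as follows. This is only an illustration under stated assumptions: the total energy is taken to be the sum of $E(k)$ over integer shells $k = 1, \ldots, K_{max}$, and the function name is hypothetical. The DNS itself may normalize over the full set of wavevectors.

```python
import math

def initial_shell_spectrum(k0, k_max=85):
    """Return E(k) = C*k*exp(-2k/k0) on shells k = 1..k_max, with C chosen
    so that the shell sum of E(k) equals 1 (assumed normalization)."""
    raw = [k * math.exp(-2.0 * k / k0) for k in range(1, k_max + 1)]
    C = 1.0 / sum(raw)
    return [C * value for value in raw]

# Example: the k0 = 5.0 case used in Series B.
spectrum_b = initial_shell_spectrum(5.0)
```

For this shape the continuous spectrum peaks at $k = k_0/2$, so a larger $k_0$ moves the initial energy to higher wavenumbers.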

In order to avoid the initial rapidly changing phase, we started to take time
averages after $t = 0.8$. The averages here are time averages from $t = 0.8$ to $t = 1.0$.
In  all  the  runs,  the  time  averaged  spectrum  k*U{li)  was  observed  to  be  nearly 
isotropic,  and  the  slope  of  U  was  steeper  than  k~*  at  high  k.  The  representative 
wavenumbers $k_Z = \zeta'/u' = \sqrt{Z/E}$ and the total enstrophy $Z$ in the runs were as
follows:

Run      B5/K5    B25     B10     K25     K10
k_Z      4.64     4.64    4.64    2.95    6.55
Z        17.5     17.5    17.6    8.13    25.4

Thus $k_Z$ and $Z$ are larger for larger $k_0$ in Series K, as would be expected.

If we take the characteristic eddy-damping time scale as $\tau_D \sim \sqrt{4/(3Z)}$, which
is suggested from Eqs.(11) and (15a), and the eddy turnover time as $\tau_T \sim 2\pi/\zeta'$,
then, for example, for K5/B25 they are given by $\tau_D \sim 0.28$ and $\tau_T \sim 1.5$. Thus
the time interval of the averages is comparable to or shorter than the damping and
eddy turnover times. (The time interval is limited in our DNS because, in order
to avoid the extra complexity caused by the introduction of an external driving force, we
are considering here only freely decaying turbulence, in which the statistics cannot
be stationary in a strict sense. Better statistics could be obtained by increasing
the number of realizations. However, a preliminary test of taking averages over 6
realizations suggested that the results are not qualitatively different
from those of one realization.)
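
The two time scales quoted above can be reproduced from the tabulated enstrophy. The sketch assumes $\tau_D = \sqrt{4/(3Z)}$ and $\zeta' = Z^{1/2}$; both are inferred from the quoted values 0.28 and 1.5, and the function names are illustrative.

```python
import math

def eddy_damping_time(Z):
    """Assumed form tau_D = sqrt(4 / (3 Z))."""
    return math.sqrt(4.0 / (3.0 * Z))

def eddy_turnover_time(Z):
    """tau_T = 2*pi / zeta', with the rms vorticity taken as sqrt(Z)."""
    return 2.0 * math.pi / math.sqrt(Z)

AVERAGING_INTERVAL = 1.0 - 0.8   # time averages are taken from t = 0.8 to 1.0
tau_D = eddy_damping_time(17.5)  # Z = 17.5 for runs B5/K5 and B25
tau_T = eddy_turnover_time(17.5)
```

Both time scales exceed the averaging interval of 0.2, consistent with the remark that the interval is comparable to or shorter than the damping and turnover times.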

The LRA approximation Eq.(5) for $T(\mathbf{k})$ with Eq.(6) was also estimated by
numerical computation. The sums over $\mathbf{p}$, $\mathbf{q}$ in Eqs.(5) and (8) were computed by
an alias-free spectral method based on the Fast Fourier Transform (FFT), as in
a previous study (Gotoh and Kaneda, 1991). In order to avoid large fluctuations in
the simulated energy spectrum, we substituted for $U(\mathbf{k})$ the isotropic band-averaged
as well as time-averaged spectrum. The wavenumber increment in the computation
is $\Delta k = 1$ as in the DNS, and the retained wavevector domain was $k < K_{max} \simeq 85$.
As a preliminary check, we computed $\Delta\omega$ in two ways: one using the single-precision
FFT and the other using the double-precision FFT. Although the value of $\Delta\omega(\mathbf{k})$
at high wavenumbers was found to be very sensitive to the precision, no significant
difference was observed at $k$ less than about 40. We therefore present results only
for $k_x < 40$ in the following. The sensitivity at high $k$ is presumably because $U(k)$
is very small there and $\Delta\omega$ has the denominator $U(k)$ as in Eq.(4a).

Figures 1, 2 and 3 show the frequency shifts by DNS and the LRA in Series B,
while Figs. 1, 4 and 5 show the shifts in Series K. In the figures, the values given by the
simplified approximation (22) are also plotted, where the value $m = 7$, which was
guessed from the energy spectrum at $k \simeq 20$ or so, is used. The energy spectrum
is not rigorously of power-law form in the DNS, and this exponent should not be
taken too seriously.

Although it is difficult to make detailed quantitative comparisons, owing to the relatively
large fluctuations in the simulated values of $\Delta\omega$ taken from a short time interval
as noted above, the figures show that the slope $\Delta\omega/k_x$ increases with $\beta$ in Series
B, and decreases with $k_0$, i.e., with the total enstrophy, in Series K. The DNS results
suggest that the shifts are nearly proportional to $k_x$, that the slopes in the figures
are positive (i.e., $\Delta\omega/k_x > 0$), and that $\Delta\omega$ exhibits only weak dependence on $k_y$. The
positivity of the slopes means that the shifts are in the direction of westward phase
propagation. These results are in agreement with previous studies (cf. Holloway,
1986). The results of the LRA as well as the simplified approximation (22) are seen
to agree qualitatively with DNS.
OSCILLATING RANDOM VELOCITY GRADIENT MODEL


In order to get some idea of the physics underlying the frequency shifts discussed
in the previous sections, let us consider the following model equation for the vorticity
$\zeta(\mathbf{k})$ of small eddies:

$$\frac{\partial}{\partial t}\zeta(\mathbf{k},t) = -\lambda V\zeta(\mathbf{k},t) - \mu(\mathbf{k})\zeta(\mathbf{k},t) + f(\mathbf{k},t), \qquad (23)$$

where the parameter $\lambda$ is introduced for later convenience, $\mu$ is a real time-independent
deterministic damping factor satisfying $\mu(\mathbf{k}) = \mu(-\mathbf{k})$, $f$ is a statistically
homogeneous and stationary white-noise random process with zero mean and
$f(-\mathbf{k}) = f^*(\mathbf{k})$, and $V$ is an operator defined by

$$V = V(\mathbf{k},t) = i k_a U_a(t) + i k_a S_{ab}(t)\frac{\partial}{\partial k_b},$$

in which $U_a$ and $S_{ab}$ are wavevector-independent random variables with zero mean.

The $U_a$- and $S_{ab}$-terms are supposed to be models for the effects of random
sweeping and random straining of the vorticity field by large eddies, respectively.
Such a representation of the effects of the large eddies has been used in studies of
the role of large eddies on small eddies (see, for example, Townsend, 1976). The
$\mu$-term is supposed to represent the viscous damping as well as the eddy-damping
due to the nonlinear interactions that are not taken into account by the $V$-term.

In the presence of the uniform strain term ($S_{ab}$-term), the two-time Eulerian
correlation function of $\zeta$ obeying Eq.(23), unlike the single-time correlation, is not
homogeneous. This implies that $\langle\zeta(\mathbf{x},t)\zeta(\mathbf{x}',t')\rangle$ depends on the space variables
$\mathbf{x}$ and $\mathbf{x}'$ not only through $\mathbf{x} - \mathbf{x}'$ unless $t = t'$ (cf. Gotoh and Kaneda, 1991).
We consider here the Fourier transform of $\langle\zeta(\mathbf{x},t)\zeta(\mathbf{x}',s)\rangle$ with respect to $\mathbf{x}$ for
$\mathbf{x}' = 0$, and define the spectrum $U(\mathbf{k},\tau,t)$ as

$$U(\mathbf{k},\tau,t) = \int \langle\zeta(\mathbf{k},\tau)\zeta(\mathbf{p},t)\rangle\,d^2\mathbf{p}/k^2.$$

This definition of $U$ is equivalent to Eq.(2) for homogeneous turbulence.
Multiplying Eq.(23) by $\zeta(\mathbf{p},s)$ and taking the average yields

g 

<  C(k,<)C(P,<s)  >  =  r<;-/i(k)  <  C(k,<)C(p,f)  >  +  <  /(k,<)C(p,0  >,  (24) 

where 

=  T<:{k,p,t)  =  -A  <  V(k,<)C(k,<)C(p,<)  >  . 
Since the imaginary parts of the second and third terms on the right-hand side of
Eq.(24) are zero, Eq.(24) gives

$$\mathrm{Re}\,\omega(\mathbf{k}) = \frac{\mathrm{Im}\,T_\zeta(\mathbf{k})}{k^2 U(\mathbf{k})}, \qquad (25)$$

where

$$T_\zeta(\mathbf{k}) = \int d\mathbf{p}\,\langle T_\zeta(\mathbf{k},\mathbf{p},t)\rangle,$$

and $\omega(\mathbf{k})$ is defined in the same way as Eq.(4b) through the frequency spectrum
$U(\mathbf{k},\omega)$. Because the linear frequency $\omega_0(\mathbf{k})$, i.e. the frequency in the absence of the
$V$-term, is zero in Eq.(24), Eq.(25) also represents the frequency shift $\Delta\omega(\mathbf{k})$ due
to nonlinear interactions, i.e.,

$$\Delta\omega(\mathbf{k}) = \frac{\mathrm{Im}\,T_\zeta(\mathbf{k})}{k^2 U(\mathbf{k})},$$

in the present model (cf. Eq.(4a)).

Since Eq.(23) is linear in $\zeta$, it is possible to solve for $\zeta$ analytically, but the expression
for $\Delta\omega$ would then be quite complicated. Hence we try here a perturbative
expansion of $\Delta\omega$ in powers of $\lambda$. When $\lambda = 0$, Eq.(23) is just the well-known
Langevin equation. Let $\zeta_0$ be the zeroth-order solution of Eq.(23) with $\lambda = 0$, and

$$\langle\zeta_0(\mathbf{k},t)\zeta_0(\mathbf{q},t)\rangle = \delta(\mathbf{k}+\mathbf{q})\,k^2 U_0(\mathbf{k}).$$

By discarding terms of higher order in $\lambda$ and putting $\lambda = 1$, we obtain after some
straightforward algebra,

=  -  r dr  <  J/„(0)S,6(-r)  >  exp|-2,i(k)7-]i<,fc.  j-[»:*c;„(k)l,  (26) 

provided that the term of second order in $S_{ab}k_a(\partial/\partial k_b)$ is negligible.

A  specific  model  of  the  correlation  between  U  and  S  in  Eq.(26)  may  be  obtained 
by  assuming 


Ua{t)  ~  ^  Ua{q,t),  Sah{t)  ~  9l>«a(q,f), 

q<K  q<K 

<  Uc,(t)Sab(s)  >=  5^  <  Ma(q.096«a(-q,'S)  >, 

q<K 


with 


(27) 
where $K \ll k$. Without loss of generality, we may put

$$\langle u_a(\mathbf{q},0)\,u_b(-\mathbf{q},-\tau)\rangle = D_{ab}(\mathbf{q})\,U(\mathbf{q})\exp[-\phi(\mathbf{q},\tau)], \qquad (28)$$

in which $\phi = \phi_R + i\phi_I$ is a function of $\mathbf{q}$ and $\tau$, and the factor $D_{ab}(\mathbf{q}) = \delta_{ab} - q_a q_b/q^2$
ensures the incompressibility condition of the velocity field.

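Substituting Eq.(27) with Eq.(28) into Eq.(26) gives

$$\Delta\omega(\mathbf{k}) = \frac{\mathrm{Im}\,T_\zeta(\mathbf{k})}{k^2 U(\mathbf{k})} = -\sum_{q<K} \mathrm{Im}\,\theta(\mathbf{k},\mathbf{q})\,\frac{|\mathbf{k}\times\mathbf{q}|^2}{k^2 q^2}\,U(\mathbf{q})\,(\mathbf{q}\cdot\nabla_{\mathbf{k}})[k^2 U(\mathbf{k})]/U(\mathbf{k}), \qquad (29)$$

to the lowest order in $\lambda$, where

$$\theta(\mathbf{k},\mathbf{q}) = \int_0^\infty \exp[-2\mu(\mathbf{k})\tau - \phi(\mathbf{q},\tau)]\,d\tau, \qquad (30)$$

and we have used $D_{ab}(\mathbf{q})k_a k_b = |\mathbf{k}\times\mathbf{q}|^2/q^2$. The right-hand side of Eq.(29) multiplied
by $U(\mathbf{k})$ is of the same form as the imaginary part of Eq.(17) except that $\mathrm{Im}\,\theta(-\mathbf{k},\mathbf{p},\mathbf{q})$
is replaced by $\mathrm{Im}\,\theta(\mathbf{k},\mathbf{q})$ in Eq.(29).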

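If we choose $\phi(\mathbf{q},\tau) = [\mu(\mathbf{q}) + i\omega(\mathbf{q})]\tau$, then Eq.(30) may be written as

$$\theta(\mathbf{k},\mathbf{q}) = \frac{1}{2\mu(\mathbf{k}) + \mu(\mathbf{q}) + i\omega(\mathbf{q})}. \qquad (31)$$

If $\omega(\mathbf{q}) \simeq \omega_0(\mathbf{q})$ and $\mu(\mathbf{k}) \gg \mu(\mathbf{q}), |\omega_0(\mathbf{q})|$ for $k \gg K > q$, then

$$\mathrm{Im}\,\theta(\mathbf{k},\mathbf{q}) \simeq -\gamma(\mathbf{k})\,\omega_0(\mathbf{q}), \qquad \gamma(\mathbf{k}) = \frac{1}{4\mu^2(\mathbf{k})}.$$

By choosing $\mu$ as

$$\mu^2(\mathbf{k}) = \tfrac{3}{8}Z, \qquad (32)$$

(i.e., $\gamma(\mathbf{k}) = 2/(3Z)$ as in (20)), and retracing the derivation of (21) from (18),
we can recover Eq.(21) from Eq.(29) under the isotropic assumption (II). When
$U(k) \propto k^{-m}$, Eq.(29) becomes identical to Eq.(21).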

The above model suggests that the correlation $\langle U_a(0)S_{ab}(\tau)\rangle$ between the
random sweeping velocity and strain of large eddies may yield the systematic westward
frequency shifts of small eddies. Equation (26) shows that the frequency shifts
are smaller for larger damping factor $\mu(\mathbf{k})$ of small eddies. The small eddies have
a characteristic life time of order $1/\mu(\mathbf{k})$ associated with the damping factor in
Eq.(23). Equation (32) or (11) with (15a) suggests that the life time is shorter for
larger total enstrophy $Z$ under certain conditions. This results in smaller frequency
shifts for larger $Z$. The result Eq.(29) with Eq.(31) shows that an increase of the
frequency $\omega(\mathbf{q})$ of large eddies yields larger frequency shifts of small eddies when
$|\omega(\mathbf{q})| \ll 2\mu(\mathbf{k})$, but the frequency shifts decrease with the increase of $\omega(\mathbf{q})$ in the
opposite limit $|\omega(\mathbf{q})| \gg 2\mu(\mathbf{k})$.
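
The two limits described above follow from the closed form of the model relaxation factor. With $\phi(\mathbf{q},\tau) = [\mu(\mathbf{q}) + i\omega(\mathbf{q})]\tau$, the integral defining $\theta(\mathbf{k},\mathbf{q})$ evaluates to $1/[2\mu(\mathbf{k}) + \mu(\mathbf{q}) + i\omega(\mathbf{q})]$. The short sketch below (function name hypothetical) shows that $|\mathrm{Im}\,\theta|$ grows with $|\omega|$ at small $|\omega|$, peaks at $|\omega| = 2\mu(\mathbf{k}) + \mu(\mathbf{q})$, and decays beyond.

```python
def model_theta(mu_k, mu_q, omega_q):
    """theta(k, q) = int_0^inf exp(-(2 mu_k + mu_q + i omega_q) tau) dtau
    = 1 / (2 mu_k + mu_q + i omega_q), valid for 2 mu_k + mu_q > 0."""
    return 1.0 / complex(2.0 * mu_k + mu_q, omega_q)

def im_theta(mu_k, mu_q, omega_q):
    """Imaginary part, equal to -omega / ((2 mu_k + mu_q)^2 + omega^2)."""
    return model_theta(mu_k, mu_q, omega_q).imag
```

For small $|\omega|$ and $\mu(\mathbf{q}) \ll \mu(\mathbf{k})$ this reduces to $-\omega/(4\mu^2(\mathbf{k}))$, the form used to identify $\gamma(\mathbf{k})$.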
OTHER QUANTITIES


A)  Frequency  Shifts  of  Eulerian  Response  Function 

In this paper we have considered the frequency shift $\Delta\omega$ of the Eulerian two-time
correlation function $U$. It might be tempting to relate the shift to that of the
Eulerian response function $G$ (or the so-called Eulerian renormalized propagator),
which may be defined, corresponding to our use of Eq.(1), as

$$G(\mathbf{k},t,s)\,\delta(\mathbf{k}+\mathbf{q}) = \langle G(\mathbf{k},\mathbf{q},t,s)\rangle,$$

where $G(\mathbf{k},\mathbf{q},t,s)$ is defined as

$$\delta\zeta(\mathbf{k},t) = \int ds\,G(\mathbf{k},\mathbf{q},t,s)\,\delta f(\mathbf{q},s),$$

in which $\delta f$ is an infinitesimal disturbance added to the right-hand side of Eq.(1)
and $\delta\zeta$ is the response to the disturbance, and $G$ obeys

+  up  -I-  tu;o(k)]G(k,k',t,s)  =  -  qyPx)i^  -  -7)C(P.0<^(q»k',<,s), 

(33a) 

$$G(\mathbf{k},\mathbf{k}',t,t) = 1. \qquad (33b)$$

Because $G$ is deterministic at $t = s$ and satisfies Eq.(33b), and $\langle\zeta\rangle = 0$, Eq.(33a)
gives

$$\left[\frac{\partial}{\partial t} + \nu k^2 + i\omega_0(\mathbf{k})\right]\langle G(\mathbf{k},\mathbf{k}',t,s)\rangle = 0, \quad \text{at } t = s.$$

Unlike the frequency shift $\Delta\omega$, there is therefore no contribution from the nonlinear
interactions to $\Delta\omega_G$, where $\Delta\omega_G$ is defined similarly to Eq.(4a) with $T$ replaced
by the average of the right-hand side of Eq.(33a). Thus it is wrong to assume the
so-called fluctuation-dissipation approximation

$$U(\mathbf{k},t,s) = U(\mathbf{k})\,G(\mathbf{k},t,s),$$

as far as the shift $\Delta\omega_G$ is concerned, and the shift $\Delta\omega$ of the Eulerian two-time
correlation should not be confused with the shift $\Delta\omega_G$ of the response function.
B) Frequency Shifts of Lagrangian Correlation Function

Another quantity which might be related to the shift $\Delta\omega$ is the shift of the Lagrangian
correlation. Let $U_L(\mathbf{k},\tau,t)$ be the Fourier transform with respect to $\mathbf{r}$ of
the Lagrangian two-time velocity correlation $\langle\mathbf{v}(\mathbf{x}+\mathbf{r},t;\tau)\cdot\mathbf{v}(\mathbf{x},t;t)\rangle$, where
$\mathbf{v}(\mathbf{x},t;\tau)$ is the velocity at time $\tau$ of the fluid particle that was at $\mathbf{x}$ at time $t$.
Because

$$\frac{\partial}{\partial\tau}\langle\mathbf{v}(\mathbf{x}+\mathbf{r},t;\tau)\cdot\mathbf{v}(\mathbf{x},t;t)\rangle = \left\langle\left[\frac{\partial}{\partial\tau}\mathbf{v}(\mathbf{x}+\mathbf{r},t;\tau)\right]\cdot\mathbf{u}(\mathbf{x},t)\right\rangle,$$

and

$$\left.\frac{\partial}{\partial\tau}\mathbf{v}(\mathbf{x},t;\tau)\right|_{\tau=t} = \left[\frac{\partial}{\partial t} + (\mathbf{u}(\mathbf{x},t)\cdot\nabla)\right]\mathbf{u}(\mathbf{x},t) = -\nabla p - (\text{terms linear in } \mathbf{u}),$$

it is shown that

$$\left[\frac{\partial}{\partial\tau} + \nu k^2 + i\omega_0(\mathbf{k})\right]U_L(\mathbf{k},\tau,t) = 0, \quad \text{at } \tau = t, \qquad (34)$$

where we have used $\langle\nabla p\cdot\mathbf{u}\rangle = 0$ in homogeneous turbulence, in which $p$ is
the pressure. Unlike the Eulerian spectrum $U$ in Eq.(3), there is therefore no
contribution from the nonlinear interactions to the $\tau$-derivative at $\tau = t$ of the
Lagrangian spectrum $U_L$. Thus the frequency shift $\Delta\omega$ should not be confused with
that of the Lagrangian correlation $U_L$.

In the LRA, $U_L$ is given by $U_L(\mathbf{k},\tau,t) = G(\mathbf{k},\tau,t)U(\mathbf{k},t)$, and the LRA with
Eq.(7) is consistent with Eq.(34). Because

{dldt)U{V,t)  =  (a/ar)f/i(k,r,t)-|-(a/aT)f/^(-k,r,0,  at  r  =  t,

and  is  real,  Eq.(34)  also  implies  that  there  is  neither  contribution  from

the  nonlinear  interactions  to  Im(a/aT)C/^^(k,T,<)  at  t  =  f,  where  is  the

Fourier  transform  of  <  v(x4-r,  r;T)  •v(x,  r;<)  >  and  {d/dT)U is  the  key
quantity in the Abridged Lagrangian History Direct Interaction Approximation by
Kraichnan (1965).

C) Frequency Shifts in Inviscid Truncated System

The inviscid truncated model of Eq.(1) with a retained wavevector domain $D$
has an equilibrium state characterized by the equilibrium energy spectrum

$$U(\mathbf{k}) = \frac{1}{a + bk^2},$$

where $a$ and $b$ are constants (Salmon et al., 1976). Since Eq.(5) gives


T{k)  1 
t72(k)  2 


=  5  E 


Uik)  U(p)  ^  ^  U(p)  J 


p.qeo 


(a similar expression has been derived by Carnevale et al., 1981), it is shown that if
$U$ is given by the equilibrium spectrum and if the triple relaxation factor satisfies
the symmetry $\theta(-\mathbf{k},\mathbf{p},\mathbf{q}) = \theta(-\mathbf{k},\mathbf{q},\mathbf{p})$ between $\mathbf{p}$ and $\mathbf{q}$, then $T(\mathbf{k})$ is identically
zero, i.e., not only the real but also the imaginary part of $T(\mathbf{k})$ is zero. The triple
relaxation factor of the LRA given by Eq.(6) in fact satisfies the symmetry, and the
LRA therefore yields $\Delta\omega(\mathbf{k}) = 0$ at the inviscid equilibrium state.


D)  Complex  Eddy  Viscosity 

There are various ways to define eddy viscosity. Following Kraichnan (1976), we
consider here the following definition of the eddy viscosity $\nu_T$,

$$\nu_T(\mathbf{k}|K) \equiv -T^{>}(\mathbf{k}|K)/[k^2 U(\mathbf{k})],$$

where $T^{>}(\mathbf{k}|K)$ is the contribution to $T(\mathbf{k})$ from the interactions among the modes
$(\mathbf{k},\mathbf{p},\mathbf{q})$ with $p$ or $q > K$. By assuming $U(\mathbf{q}) \ll U(\mathbf{k})$ for $q \gg k$, and noting that
Eq.(5) gives


T>(k|R’)  1 

Uik)  ~  2 


9  51  ^(-k,p,q) 

q>K 


Ip  X  qp 


iq^-p‘^){\p^Uip)-q^Uiq)]+k^[U{q)-U{p)]l 


for $k \ll K$, we obtain


Mklif)  =  E  «(-k,  P,q)(k  X  q)*^^^(k  •  V,)|g^t/(q)l, 


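for $k \ll K$, where the triple relaxation factor may be approximated as

$$\theta(-\mathbf{k},\mathbf{p},\mathbf{q}) \simeq \theta(-\mathbf{k},-\mathbf{q},\mathbf{q}) = \int_0^\infty \exp[-2\phi_R(\mathbf{q},\tau) - \phi_R(-\mathbf{k},\tau) - i\phi_I(-\mathbf{k},\tau)]\,d\tau,$$

provided that $\phi(\mathbf{p}) \simeq \phi(-\mathbf{q})$ for $\mathbf{p} = \mathbf{k} - \mathbf{q}$ and $k \ll q \simeq p$.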

If we suppose $U(\mathbf{q}) \simeq U(q)$ and

$$\phi_I(\mathbf{k},\tau) \simeq \alpha(k)\,\omega_0(\mathbf{k})\,\tau, \qquad |\phi_I(\mathbf{k},\tau)| \ll \phi_R(\mathbf{q},\tau),$$

for $\tau = O(\tau_R(\mathbf{k}))$ in Eq.(35), then

Q'(A:)u;o(k)  ^  'r{<i)d[q^Uiq)] 


Imi/(k(fc£,)  = 


E 


8  q  dq  ’ 

where $\tau_R(\mathbf{k})$ is the characteristic time scale of $\phi_R(\mathbf{k},\tau)$, and

$$\gamma(\mathbf{q}) = \int_0^\infty \tau\exp[-2\phi_R(\mathbf{q},\tau)]\,d\tau.$$

Thus the imaginary part of the eddy viscosity $\nu_T$ may be nonzero.


CONCLUSION 

The results obtained in the present paper may be summarized as follows.

[I]. The DNS and the LRA agree in the following points:

(1) the shifts are westward, i.e., $\Delta\omega > 0$ for $k_x > 0$,

(2) the shifts are nearly proportional to $k_x$,

(3) the shifts increase with $\beta$,

and (4) the shifts increase with $E/Z$, but are independent of either amplitude under
certain conditions.

[II]. The above properties may be explained by a model that includes

(1) oscillating random sweeping and strain of large eddies,
and (2) eddy-damping of small eddies.

These are represented by the $V$- and $\mu$-terms in the model (23). The LRA as
well as the model suggests that the shifts may occur even if the energy spectrum is
nearly isotropic.

[III]. The time dependence of the Eulerian correlation should not be confused with those
of the Eulerian response function and/or the Lagrangian correlation. It is wrong to assume
the fluctuation-dissipation relation for the Eulerian correlation. An analysis of the
nonlocal interactions suggests that eddy viscosity may be complex.

The present paper treats only cases of small $\beta$, and the effects of high $\beta$ and
strong anisotropy remain to be studied. The role of coherent structures, which
was not taken into account in the theory, remains an open question.
ABSTRACT

This  paper  examines  the  formulation,  application  and  utility  of  certain  ideas  from 
equilibrium  statistical  mechanics  to  physical  oceanography.  In  particular  we  discuss  the 
connection  of  these  ideas  to  selective  decay  (minimum  enstrophy)  theories  and  to  the 
production  of  nonlinearly  stable  states,  as  well  as  its  limitations  when  dealing  with  forced- 
dissipative  flows.  A  robust  prediction  of  the  theories  discussed  is  the  generation  of  mean 
flows  around  topography,  which  should  be  amenable  to  observational  verification  or 
falsification. 

1.  INTRODUCTION 

The goal of this paper is to try to put into an oceanic framework various concepts from
the  fields  of  equilibrium  statistical  mechanics  and  from  turbulence,  and  to  attempt  to 
understand  their  importance  and  relevance,  if  any,  to  the  circulation  of  the  world's  oceans. 
To  these  ends  we  first  briefly  summarize  the  theory  of  equilibrium  statistical  mechanics  as 
applied  to  geophysical  fluids,  determining  the  conditions  under  which  it  applies,  and,  in 
those  conditions,  what  the  predictions  of  the  theory  are.  Then,  we  discuss  whether 
numerical  simulations  of  the  equations  of  motion  do  in  fact  give  rise  to  the  predicted 
(maximum  entropy)  solutions  under  the  appropriate  conditions. 

The statistical equilibrium has often been compared to an alternative theory, the ‘selective
decay’  or  ‘minimum  enstrophy’  hypothesis,  which  predicts  evolution  toward  the 
nonlinearly  stable  ‘minimum  enstrophy’  state,  and  we  shall  discuss  this  connection.  Both 
of  these  theories  are  in  a  sense  equilibrium  theories;  the  statistical  mechanics  applies  to 
time  or  ensemble  averages,  and  the  selective  decay  theory  predicts  the  end-state  of  a 
weakly decaying system. Neither can describe certain disequilibrium phenomena arising in
forced-dissipative  situations.  We  shall  show  that  certain  important  phenomena,  such  as  the 
formation  of  jets  on  a  beta-plane,  cannot  in  fact  be  described  by  such  theories.  Finally,  we 
discuss  the  application  of  both  equilibrium  and  non-equilibrium  theories  to  real  oceanic 
flows,  and  briefly  discuss  where  their  predictions  could  be  observationally  tested. 
2. THEORETICAL FORMULATION

The  simplest  model  system  with  which  to  fix  ideas  is  the  barotropic  vorticity  equation,  to 
wit: 


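$$\frac{\partial\zeta}{\partial t} + (\mathbf{u}\cdot\nabla)\zeta = 0 \qquad (2.0)$$

where $\mathbf{u}$ is the two-dimensional velocity and $\zeta$ is the vorticity. In terms of a stream
function $\psi$, $\mathbf{u} = \nabla\times\psi$ and $\zeta = \nabla^2\psi$. If the domain is homogeneous, there is no mean flow,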
and  energy  and  enstrophy  are  both  conserved.  In  fact,  any  integral  function  of  the  vorticity 
is  conserved.  To  see  this,  note  that  the  equation  states  that  the  evolution  consists  merely 
of  a  continuous  re-arrangement  of  the  vorticity,  which  is  conserved  on  each  parcel. 
Similarly,  any  function  of  vorticity  is  conserved  on  parcels.  Thus,  an  integral  over  the 
domain  of  any  function  of  the  vorticity  is  preserved,  since  the  integration  is  indifferent  to 
the  location  of  the  parcels  themselves.  The  quadratic  invariants,  energy  E,  and  enstrophy 
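$Z_2$, are given by

$$E = \frac{1}{2}\int_S \mathbf{u}\cdot\mathbf{u}\,d\mathbf{x}, \qquad (2.1)$$

$$Z_2 = \frac{1}{2}\int_S \zeta^2\,d\mathbf{x}. \qquad (2.2)$$

Circulation,

$$Z_1 = \int_S \zeta\,d\mathbf{x}, \qquad (2.3)$$

is also conserved. Of all the integral invariants, these three have assumed a special
importance in the equilibrium theory, as discussed further in section 4.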
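
The rearrangement argument can be illustrated with a toy discretization. The sketch assumes the fluid is divided into equal-area parcels, each carrying a fixed vorticity value, and that inviscid evolution is mimicked by a random permutation of the parcels. This is a crude stand-in for area-preserving advection, and the names are hypothetical. The domain integral of any function of vorticity is then unchanged.

```python
import math
import random

def domain_integral(f, parcels):
    """Sum of f(vorticity) over equal-area parcels.

    math.fsum returns the correctly rounded sum, so the result does not
    depend on the order in which the parcels are visited."""
    return math.fsum(f(z) for z in parcels)

def rearrange(parcels, seed=0):
    """Return the same parcel values in a shuffled order."""
    shuffled = list(parcels)
    random.Random(seed).shuffle(shuffled)
    return shuffled

rng = random.Random(1)
vorticity = [rng.gauss(0.0, 1.0) for _ in range(1000)]
mixed = rearrange(vorticity)
```

Circulation corresponds to `f(z) = z` and enstrophy to `f(z) = 0.5 * z * z`. Energy is not of this form, since it depends on where the parcels are located.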

The  inviscid  equation  of  motion  may  be  written  in  the  form 

(2.4) 

«»  kpq 

where  is  the  spectral  coefficient  of  the  k*  wavevector,  and  the  geometric  interaction 
coefficients  are  zero  unless  the  wavevectors  form  a  triad  k  +  p  +  q  =  0.  Also,  if  two 
of  the  three  members  are  equal  =  0.  These  conditions  lead  to 
This important property means the system is Liouvillean (in fact the system satisfies the
detailed Liouvillean property). In the phase space of the spectral coefficients the motion is
therefore  incompressible.  If  an  ensemble  of  system  states  is  represented  by  a  cloud  in  the 
phase  space,  the  cloud  preserves  its  volume.  Statistical  mechanics,  in  particular  the  notion 
that  the  properties  of  a  system  will  be  given  by  the  maximum  entropy  state,  may  then  be 
applied.  There  are  now  two  ways  to  proceed.  In  the  first — perhaps  the  more 
conventional — one  assumes  that  a  single  system  will  explore  all  accessible  phase  space 
with equal likelihood; this is the ergodic hypothesis. Alternatively, one may assume,
without  reference  to  the  ergodic  hypothesis,  that  the  least  biased  assumption  one  can  make 
about  the  averaged  properties  of  a  system  is  that  its  time  average  state  is  given  by  the 
maximum  likelihood  state.  This  is  the  information  theory  approach  (Jaynes  1979). 
Although  the  underlying  philosophy  is  different,  either  way  one  must  compute  the 
maximum  entropy  state.  The  difference  in  these  two  attitudes  far  transcends  applications 
in geophysical fluid dynamics; it goes to the heart of statistical mechanics, and we will not
discuss  in  any  detail  the  differences  here.  The  information  theory  approach  requires  no 
assumptions  about  the  behaviour  of  a  system;  it  merely  says  “this  is  what  we  know  about  a 
system, and this is what we can predict without implicitly making additional assumptions.”
No  ‘mixing’  hypotheses,  for  example,  are  required.  It  is  essentially  a  Bayesian  approach 
(see e.g., Gull 1991, and other articles in Buck and Macaulay 1991). The prior constraints
are  the  known  invariants:  in  principle  any  invariant  or  other  constraint  could  be  built  in. 
Aside  from  these  constraints  equal  probability  is  assigned  to  each  micro-state — a  principle 
of  least  bias,  which  says  nothing  about  how  a  system  may  actually  behave.  In  spite  of  this 
seemingly  rational  basis,  many  physicists  are  uncomfortable  with  the  information  theory 
approach,  since  for  any  given  system  it  offers  little  assurance  that  its  predictions  will  be  of 
any  use  whatsoever,  and  it  seems  to  divorce  the  predictions  one  makes  of  a  system  from 
its  physics.  The  ergodic  hypothesis,  on  the  other  hand,  makes  the  explicit  physical 
assumption  that  a  system  will  explore  all  regions  of  phase  space  available  to  it,  constrained 
by  the  global  integral  invariants.  However,  it  is  generally  extremely  difficult  to  rigorously 
prove  that  a  particular  system  is  ergodic,  and  for  most  systems  it  remains  an  assumption. 

In  either  case,  the  time  or  ensemble  averaged  state  is  given  by  the  maximum  entropy  state. 
The  problem  is  thus  to  maximize  the  entropy, 

$$ S = -\sum_i p_i \ln p_i \qquad (2.6) $$

where $p_i$ is the probability of the system being in the $i$th microstate, subject to the inviscid
constraints. Assuming for the moment that only the quadratic constraints (2.1) and (2.2)
and the circulation (2.3) are relevant, the system will satisfy a Gibbs distribution

$$ p_i = \exp\left[-(\alpha E + \gamma Z_1 + \delta Z_2)\right] \qquad (2.7) $$
where the parameters $\alpha$, $\gamma$ and $\delta$ are Lagrange multipliers. (See e.g. Tolman 1938. See
Holloway 1986 for a review of many applications of statistical methods to geophysical
fluid mechanics.) Fairly standard methods can then be used to make predictions of the
mean flow and its variance (e.g. Kraichnan 1975; Salmon, Holloway and Hendershott
1976). The spectrum of eddy kinetic energy of the maximum entropy state is given by
(Kraichnan 1975):

$$ E(k) = \frac{\pi k}{\alpha(\mu + k^2)} \qquad (2.8) $$
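The shape of this spectrum is easy to tabulate. The following is a minimal sketch, assuming the form $E(k) = \pi k / [\alpha(\mu + k^2)]$ for (2.8); the function names and the values of `ALPHA`, `MU` and `K_MAX` are hypothetical choices made only for illustration. It shows that the energy spectrum peaks near $k = \sqrt{\mu}$, while the enstrophy spectrum $k^2 E(k)$ grows all the way to the truncation wavenumber.

```python
import math

def energy_spectrum(k, alpha, mu):
    """Equilibrium energy spectrum E(k) = pi*k / (alpha*(mu + k**2))."""
    denom = alpha * (mu + k * k)
    if denom <= 0.0:
        raise ValueError("alpha*(mu + k^2) must be positive for a normalizable state")
    return math.pi * k / denom

def enstrophy_spectrum(k, alpha, mu):
    """Enstrophy spectrum k**2 * E(k)."""
    return k * k * energy_spectrum(k, alpha, mu)

# Hypothetical parameter values, chosen only for illustration.
ALPHA, MU, K_MAX = 1.0, 4.0, 64
wavenumbers = list(range(1, K_MAX + 1))
energy = [energy_spectrum(k, ALPHA, MU) for k in wavenumbers]
enstrophy = [enstrophy_spectrum(k, ALPHA, MU) for k in wavenumbers]

# Wavenumbers at which each spectrum is largest.
k_energy_peak = wavenumbers[energy.index(max(energy))]
k_enstrophy_peak = wavenumbers[enstrophy.index(max(enstrophy))]

if __name__ == "__main__":
    print("energy peaks at k =", k_energy_peak)
    print("enstrophy peaks at k =", k_enstrophy_peak)
```

At large $k$ the product $k E(k)$ tends to $\pi/\alpha$, which corresponds to equipartition of enstrophy among the high-wavenumber modes.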


In  a  homogeneous  environment  (for  example  a  doubly  periodic  flow  with  no  topography) 
there is no mean flow. However, the presence of topography, or of boundaries, will in
general produce a mean flow. The equation of motion is then

$$ \frac{\partial q}{\partial t} + J(\psi, q) = 0 \qquad (2.9) $$

where $q = \nabla^2\psi + h(x,y) + \beta y$. The beta effect appears in formally the same way as
topography.  In  a  closed  domain  with  boundary  conditions  of  no  normal  flow  (or  in  a 
channel  geometry)  the  enstrophy  constraint  (2.2)  is  replaced  by  the  condition  that  potential 
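enstrophy is conserved, where

$$ Q_2 = \frac{1}{2}\int q^2 \, d\mathbf{x}. \qquad (2.10) $$

The steady component of the maximum entropy flow is then given by the linear
relationship

$$ \langle q \rangle = \mu \langle\psi\rangle + \lambda, \qquad (2.11a) $$

which gives the Helmholtz equation

$$ (\mu - \nabla^2)\langle\psi\rangle = \beta y + h(x,y) - \lambda. \qquad (2.11b) $$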

The values of $\lambda$, $\mu$, and $\alpha$ are determined implicitly by the values of the energy, enstrophy
and circulation. In a doubly periodic domain the $\beta$-effect plays no direct role. This is
because in such a domain the inviscid invariants do not depend on beta; $Z_2$ remains
invariant, and there is no mean flow, as demanded also by homogeneity. This is, however,
rather a special case because $Z_2$ is not a Casimir: its conservation depends on the special
relationship between $\zeta$ and $\psi$. In a closed domain, or in a zonal channel, the Casimir is
conserved, and $\beta$ will in general affect the mean flow.
3 EXPERIMENTAL VERIFICATION

We now examine for a few cases whether a single system does indeed evolve into a
maximum  entropy  state.  A  strict  information  theorist  might  argue  that  it  is  irrelevant  to  the 
theory  whether  or  not  a  system  is  ergodic;  however,  ergodicity  is  an  interesting  property  in 
its  own  right,  regardless  of  its  role  in  the  foundations  of  statistical  mechanics.  Most 
numerical  models  conserve  only  the  quadratic  invariants,  plus  circulation,  and  are  not 
guaranteed  to  respect  the  higher  order  invariants.  Thus,  one  aspect  that  will  be  of  interest 
is whether a numerical simulation will evolve into a state governed only by the
quadratic  invariants,  or  whether  higher  order  invariants  may  nevertheless  somehow  play  a 
role. 

The equilibrium spectrum of inviscid two-dimensional fluids in doubly periodic domains
has been demonstrated by Carnevale (1982) and Carnevale and Vallis (1984). Using a
de-aliased spectral code which exactly conserves energy and enstrophy (such a model may
be termed ‘quadratically inviscid’), the energy spectrum averaged over a long enough time
is found to be that of (2.8), as illustrated in Figure 1.

Figure  1.  Predicted  equilibrium  energy 
spectrum  (a),  and  enstrophy  spectrum  (b),  in  a 
spectrally  truncated  inviscid  model,  for  two 
values of the truncation wavenumber $k_{\max}$.


In a closed domain, the mean flows of the maximum entropy states are known as
Fofonoff flows. Fofonoff (1954 and 1962) studied the analytical solution of equation
(2.11) in a square basin with a no-normal-flow boundary condition. Wang and Vallis
(1993) integrated (2.9) in a closed domain with a quadratically inviscid numerical
model. A typical (analytic) Fofonoff solution, for positive $\mu$, is shown in Figure
2. The absolute vorticity field is parallel to the streamfunction field, as required by the
linear relationship. The relative vorticity is confined to the boundary layer, the
thickness of which, $l$, is given by

$$ l = \frac{1}{\sqrt{\mu}} \sim L\,R_\beta^{1/2}, \qquad (3.0) $$

where $R_\beta$ is the $\beta$-plane Rossby number, defined as $R_\beta \equiv U/(\beta L^2)$, where $U$ is the root
mean square velocity, and $L$ is the basin size. The boundary layer gets thinner as $\mu$
increases, or total energy decreases. The absolute vorticity field is dominated by the
planetary vorticity $\beta y$ inside the basin, where the flow is westward. The flow returns to the
eastern boundary through northern and southern boundary layers, forming two gyres,
anticyclonic in the northern basin, cyclonic in the southern basin. The parameter $\lambda$ affects the
symmetry of the fields; in a domain stretching from $y = -L$ to $+L$, for zero $\lambda$, the
fields are symmetrical about $y = 0$; for general non-zero $\lambda$, one gyre will be enlarged, while
the other will be squeezed, and in the extreme case one gyre can fill out the whole basin.
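The qualitative structure just described can be checked with a very small numerical solve of the Helmholtz equation $(\mu - \nabla^2)\psi = \beta y - \lambda$ with $\psi = 0$ on the walls. The following is a rough sketch, not the method of any of the cited papers: it assumes a square basin $[-1,1]^2$, a five-point Laplacian on a coarse grid, plain Gauss-Seidel iteration, and the illustrative values $\mu = 40$, $\beta = 10$, $\lambda = 0$. The function name `solve_fofonoff` and its arguments are hypothetical.

```python
def solve_fofonoff(mu, beta, lam=0.0, n=32, sweeps=300):
    """Gauss-Seidel solve of (mu - del^2) psi = beta*y - lam on [-1,1]^2, psi = 0 on the walls."""
    h = 2.0 / n
    ys = [-1.0 + j * h for j in range(n + 1)]
    psi = [[0.0] * (n + 1) for _ in range(n + 1)]  # psi[j][i]: j indexes y, i indexes x
    diag = mu + 4.0 / (h * h)
    for _ in range(sweeps):
        for j in range(1, n):
            rhs = beta * ys[j] - lam
            row, above, below = psi[j], psi[j + 1], psi[j - 1]
            for i in range(1, n):
                neighbours = row[i - 1] + row[i + 1] + above[i] + below[i]
                row[i] = (rhs + neighbours / (h * h)) / diag
    return ys, psi, h

MU, BETA = 40.0, 10.0            # illustrative values only
ys, psi, h = solve_fofonoff(MU, BETA)
n = len(ys) - 1
mid = n // 2

# Zonal velocity u = -d(psi)/dy at the basin centre and next to the northern wall.
u_center = -(psi[mid + 1][mid] - psi[mid - 1][mid]) / (2.0 * h)
u_north = -(psi[n][mid] - psi[n - 1][mid]) / h

if __name__ == "__main__":
    print("interior u =", u_center, " (compare -beta/mu =", -BETA / MU, ")")
    print("u next to northern wall =", u_north)
```

With these assumptions the interior flow is westward with speed close to $\beta/\mu$, and the flow adjacent to the northern wall is eastward, as in the description of the two gyres.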

Just  as  for  the  spectral  doubly  periodic  case,  the  inviscid  simulations  do  show  an  approach 
to a maximum entropy state.¹ Figure 3 shows the resulting time averaged streamfunction
and  Figure  4  shows  a  scatter  plot  of  streamfunction  versus  potential  vorticity.  Many  other 


(left) Figure 2. Analytic Fofonoff
solution in a closed domain with $\mu = 40$,
$\beta = 10$ and $\lambda = 0$. (a) Relative vorticity.
(b) Potential vorticity. (c) Streamfunction.

(center) Figure 3. As for Figure 2, but
the results are now obtained as the time
average of an energy and enstrophy
conserving numerical simulation.

(above) Figure 4. Scatter plot of $q$ versus $\psi$,
corresponding to the solution in Figure 3.


¹ Strictly, this and other numerical demonstrations in this article are not rigorous demonstrations
of ergodicity. They merely demonstrate that the system reaches a macro-state similar to that of the
maximum entropy state.
simulations and details are presented in the paper by Wang and Vallis (1993). They
consider different shaped domains, various parameter values and cases with topography. It
appears  that,  in  general,  inviscid  integrations  do  indeed  evolve  in  such  a  way  that  the  time 
averaged  flow  is  very  similar  to  the  maximum  entropy  state.  In  other  words,  the 
simulations  suggest  the  flow  is  ergodic. 

4  SELECTIVE  DECAY,  STABILITY  THEORY 
AND  HIGH  ORDER  INVARIANTS 

Minimum  enstrophy  states 

Related,  certainly  in  the  minds  of  many  oceanographers,  to  maximum  entropy  theories  are 
so-called  selective  decay  theories  (Bretherton  and  Haidvogel  1976).  Although  apparently 
not  based  on  such  a  fundamental  tenet  as  maximizing  entropy,  these  can  and  have  been 
applied in a number of areas of physics, such as magneto-hydrodynamics, as well as in
geophysics. (Other variational principles, such as ‘minimum energy dissipation,’ exist.
Montgomery and Phillips, 1990, argue that minimum energy dissipation is a consequence
of maximum entropy, and we shall see that this is also true for the minimum enstrophy
principle.) Consider two-dimensional or quasi-geostrophic flows. Then, in any turbulent
situation enstrophy may be expected to be dissipated by viscosity at a much faster rate
than energy. This is because (in the classic theory of two-dimensional turbulence) energy
is trapped at the relatively inviscid large scale whereas enstrophy is transferred to small
scales where it may be dissipated by viscosity. Indeed, in the limit of zero viscosity,
enstrophy is dissipated whereas energy is conserved. (This is an equilibrium prediction,
which does not violate the regularity results that enstrophy dissipation is zero if viscosity is
zero. See Vallis 1985 and 1992.) Thus, the end state of a decaying system may be expected
to be close to a minimum enstrophy state for a given energy. Consider arbitrary variations
satisfying $\delta\psi = 0$ on the boundary $\Gamma$, minimizing potential enstrophy for given
circulation and energy $E$. We require

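$$ \delta\int_S \tfrac{1}{2}(\nabla^2\psi + \beta y)^2\,dx\,dy + \mu\,\delta\int_S \tfrac{1}{2}(\nabla\psi)^2\,dx\,dy - \lambda\,\delta\int_S \nabla^2\psi\,dx\,dy = 0. \qquad (4.0) $$

After integrating by parts this yields

$$ \int_S \nabla^2(\nabla^2\psi + \beta y - \mu\psi)\,\delta\psi\,dx\,dy + \int_\Gamma (\nabla^2\psi + \beta y - \lambda)\,\frac{\partial\,\delta\psi}{\partial n}\,dl = 0, \qquad (4.1) $$

which gives, since both $\delta\psi$ and the boundary value of $\partial\,\delta\psi/\partial n$ are arbitrary,

$$ \nabla^2(\nabla^2\psi + \beta y - \mu\psi) = 0 \quad \text{within } S, \qquad (4.2) $$

and

$$ \nabla^2\psi + \beta y - \lambda = 0 \quad \text{on } \Gamma. \qquad (4.3) $$

Thus, using $\psi = 0$ on $\Gamma$, we obtain

$$ (\mu - \nabla^2)\psi = \beta y - \lambda \quad \text{everywhere.} \qquad (4.4) $$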

Hence,  minimization  of  potential  enstrophy  gives  the  same  linear  relationship  between 
absolute vorticity and streamfunction as the maximum entropy prediction, although we
have not yet shown that the parameters are the same. But in fact, in the limit of infinite
resolution, the maximum entropy state is identical to the minimum potential enstrophy
state.
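The integration by parts in the minimum enstrophy variation can be written out step by step. This is a sketch under the stated assumptions that $\psi = 0$ and $\delta\psi = 0$ on $\Gamma$; the shorthand $A$ is introduced here only for convenience and is not part of the original notation.

```latex
\begin{aligned}
0 &= \int_S (\nabla^2\psi + \beta y)\,\nabla^2\delta\psi \,dx\,dy
   + \mu \int_S \nabla\psi\cdot\nabla\delta\psi \,dx\,dy
   - \lambda \int_S \nabla^2\delta\psi \,dx\,dy \\
  &= \int_S A\,\nabla^2\delta\psi \,dx\,dy ,
  \qquad A \equiv \nabla^2\psi + \beta y - \mu\psi - \lambda
  \quad (\text{using } \psi = 0 \text{ on } \Gamma) \\
  &= \int_S \delta\psi\,\nabla^2 A \,dx\,dy
   + \int_\Gamma \left( A\,\frac{\partial\,\delta\psi}{\partial n}
   - \delta\psi\,\frac{\partial A}{\partial n} \right) dl
  \quad (\text{Green's second identity}) \\
  &= \int_S \nabla^2(\nabla^2\psi + \beta y - \mu\psi)\,\delta\psi \,dx\,dy
   + \int_\Gamma (\nabla^2\psi + \beta y - \lambda)\,
     \frac{\partial\,\delta\psi}{\partial n}\,dl .
\end{aligned}
```

The last line uses $\delta\psi = 0$ and $\psi = 0$ on $\Gamma$, together with $\nabla^2\lambda = 0$. The interior condition then says that $\nabla^2\psi + \beta y - \mu\psi$ is harmonic in $S$, and the boundary condition says that it equals $\lambda$ on $\Gamma$ (where $\psi = 0$). A harmonic function that is constant on the boundary is constant throughout, which gives the linear relationship $(\mu - \nabla^2)\psi = \beta y - \lambda$ everywhere.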

For the sake of discussion, consider flow in a periodic domain, with $\beta = 0$. Carnevale and
Frederiksen (1987) show that a steady flow defined by the form (2.11a) or (4.4) is stable
in the sense of Lyapunov, in that the maximum amplitude to which a perturbation may
grow is bounded by its initial amplitude, if $\mu > k_1^2$, where $k_1$ is a wavenumber smaller than
the smallest wavenumber of the topography. (If $q'(\psi)$ is positive everywhere, stability
follows immediately by Arnol'd's first theorem; Arnol'd 1966.) Stability occurs physically
because the branch of solutions with $\mu > k_1^2$ corresponds to a minimum enstrophy state.

(A state of maximum enstrophy for a given energy is also stable, although it does not
correspond to a physically realizable state at infinite resolution.) In general, any physical
state which corresponds to an extremum of conserved quantities must be stable, for if the
system is perturbed from that state, it must remain close to the extremum state. Thus, a
minimum enstrophy state for a given energy is stable, since this is an extremum of the
conserved quantity $Q_2 + \mu E$. It is equivalent to a maximum energy state for a given
enstrophy. Now, the maximum entropy state is not a steady state, and it is not appropriate
to call it a ‘stable’ state. However, in the limit of infinite resolution, it can be shown that
the statistical mechanical equilibrium becomes a steady Arnol'd stable state, identical to the
minimum enstrophy state. That is, the eddy energy vanishes at all finite wavenumbers. The
proof is to be found in Carnevale and Frederiksen (1987).

Thus,  there  is  a  close  connection  between  the  maximum  entropy  and  minimum  enstrophy 
theories.  To  see  the  underlying  physical  connection,  ask  the  question  ‘why  is  enstrophy 
dissipated  faster  than  energy?’  An  answer  may  be  found  in  statistical  mechanics.  Consider 
an inviscid spectrally truncated flow with no topography, with energy and enstrophy
localised around some wavenumber, and suppose that the turbulence begins. As the
system increases its entropy, it evolves toward the distribution of Figure 1; energy will be
confined to the small wavenumbers, whereas enstrophy is moved to higher wavenumbers.
Now imagine increasing the cut-off wavenumber. The energy remains trapped, whereas
the enstrophy moves to higher and higher wavenumber, and is essentially flushed from the
large scales. Indeed, at infinite resolution, all the enstrophy is at large wavenumber, and all
the energy is confined to the small wavenumber (Kraichnan 1975). Thus, at finite
wavenumbers a ‘minimum enstrophy state’ emerges. The generalization to the
topographic state is straightforward, and at infinitely high resolution maximum entropy
and minimum enstrophy are identical. There is no eddy (time-varying) flow, just steady
flow locked to the topography by (2.11b). The flow at finite wavenumber is steady and
nonlinearly stable.
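The flushing of enstrophy to small scales as the cut-off is raised can be illustrated with a toy calculation. The sketch below rests on several assumptions that are not in the text: the equilibrium spectrum is taken as $E(k) \propto k/(\mu + k^2)$ summed over integer shells $k = 1, \ldots, k_{\max}$, the ratio of total enstrophy to total energy is held fixed at an arbitrary value of 16, and $\mu$ is found by bisection. The function names are hypothetical.

```python
def shell_spectra(mu, k_max):
    """Energy and enstrophy per integer shell k, up to the common factor pi/alpha."""
    ks = range(1, k_max + 1)
    energy = [k / (mu + k * k) for k in ks]
    enstrophy = [k ** 3 / (mu + k * k) for k in ks]
    return energy, enstrophy

def solve_mu(ratio, k_max, iterations=200):
    """Bisect for mu > -1 so that (total enstrophy)/(total energy) equals ratio."""
    lo, hi = -1.0 + 1e-12, 1.0e6
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        energy, enstrophy = shell_spectra(mid, k_max)
        if sum(enstrophy) / sum(energy) < ratio:
            lo = mid   # mean-square wavenumber too small: increase mu
        else:
            hi = mid
    return 0.5 * (lo + hi)

def centroids(ratio, k_max):
    """Energy-weighted and enstrophy-weighted mean wavenumbers at equilibrium."""
    mu = solve_mu(ratio, k_max)
    energy, enstrophy = shell_spectra(mu, k_max)
    ks = range(1, k_max + 1)
    k_energy = sum(k * e for k, e in zip(ks, energy)) / sum(energy)
    k_enstrophy = sum(k * z for k, z in zip(ks, enstrophy)) / sum(enstrophy)
    return k_energy, k_enstrophy

RATIO = 16.0   # hypothetical enstrophy/energy ratio (initial wavenumber about 4)
results = {k_max: centroids(RATIO, k_max) for k_max in (8, 16, 32, 64)}

if __name__ == "__main__":
    for k_max, (k_e, k_z) in results.items():
        print(k_max, round(k_e, 2), round(k_z, 2))
```

Under these assumptions the energy-weighted mean wavenumber falls toward the gravest mode as $k_{\max}$ increases, while the enstrophy-weighted mean wavenumber rises with the cut-off.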

Higher  Order  Invariants 

As  mentioned,  the  continuous  equations  conserve  an  infinity  (an  uncountable  one!)  of 
integral  invariants.  Canonical  equilibrium  theory  based  on  the  conservation  of  energy  and 
enstrophy has but one mean state, $\mu\langle\psi\rangle = \langle q\rangle$. A general stationary state satisfies

$$ \psi = F'(q) \qquad (4.5) $$

where $F$ is an arbitrary differentiable function. If $F''(q)$ is strictly positive, i.e. $0 < c <
F''(q) < C < \infty$, then nonlinear stability ensues (Arnol'd 1966). Suppose that $F$ is chosen to
satisfy the stability criterion, and consider a system close to (4.5). Then, the system cannot
deviate too far from its initial state. If $F'$ is chosen to be a nonlinear function, the system
certainly cannot be expected to produce time average statistics which satisfy a linear $q$-$\psi$
relationship.  By  the  time-reversibility  of  the  dynamics,  a  system  which  begins  its  evolution 
in  some  other  state  far  from  (4.5)  can  never  approach  that  state  too  closely.  Some  regions 
of  phase  space  are  forbidden  to  it,  and  ergodicity  on  the  energy-enstrophy  surface  will 
again  not  arise.  Shepherd  (1987)  has  explicitly  demonstrated  that  beta  plane  dynamics  are 
not  ergodic  on  the  energy-enstrophy  surface;  if  beta  is  sufficiently  strong  and  the  system  is 
initially  in  a  sufficiently  anisotropic  state,  then  the  system  will  remain  anisotropic,  because 
the  higher  order  invariants  prevent  the  system  from  ever  becoming  isotropic.  These 
results, however, are not criticisms of the statistical mechanical method per se; it is simply
that we have not incorporated all the known constraints. Since potential vorticity is
conserved on parcels, arbitrary integral functions of potential vorticity are invariant. Thus,
with the invariant

$$ H = E + G(q) \qquad (4.6) $$

where $E$ is the energy and the Lagrange multiplier is absorbed into the definition of the
arbitrary function $G$, the appropriate Gibbs distribution is

$$ p = \exp(-\alpha H) \qquad (4.7) $$

where $\alpha$ is positive for normalizability. Then, in a manner quite analogous to that which
produced (2.11) we obtain the flow

$$ \langle\psi\rangle = \langle G'(q)\rangle. \qquad (4.8) $$
Thus, we can in fact regain arbitrary stationary flows from the statistical mechanics,
although since $G$ can be almost any function, it might appear that the statistical mechanics
is really a quite unhelpful predictive tool. Furthermore, note that (4.8) differs from
$\langle\psi\rangle = G'(\langle q\rangle)$ unless $G'$ is a linear function, so that the equilibrium state cannot be
directly calculated unless it is steady.
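The remark about linearity reflects a general fact: averaging commutes with a linear function but not with a nonlinear one. The following small sketch makes this concrete with synthetic data. The Gaussian samples stand in for a fluctuating potential vorticity field, and the functions `linear` and `cubic` are hypothetical stand-ins for the derivative of an invariant; none of these choices come from the text.

```python
import random

def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)

def linear(q):
    """A linear function: averaging commutes with it."""
    return 2.0 * q + 1.0

def cubic(q):
    """A nonlinear function: averaging does not commute with it."""
    return q ** 3

rng = random.Random(0)
# Hypothetical fluctuating field: Gaussian samples with mean 1 and unit variance.
samples = [rng.gauss(1.0, 1.0) for _ in range(20000)]
q_mean = mean(samples)

# Difference between "mean of the function" and "function of the mean".
linear_gap = abs(mean([linear(q) for q in samples]) - linear(q_mean))
cubic_gap = abs(mean([cubic(q) for q in samples]) - cubic(q_mean))

if __name__ == "__main__":
    print("linear gap:", linear_gap)
    print("cubic gap:", cubic_gap)
```

For the linear function the two averages agree to rounding error; for the cubic they differ by an amount set by the variance of the fluctuations, which vanishes only if the state is steady.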

The flow produced by (4.5) will not, however, necessarily be stable in a truncated finite
difference or spectral numerical model, unless $F'$ is a linear function. This is because the
stability of such a flow requires the integral of $F$ to be conserved, which does not in
general hold for a truncated model. The role of such higher order constraints is rather
unclear at the moment, especially in the light of interesting results by Robert and
Sommeria (1991) which purport to explain the prevalence of coherent structures in
two-dimensional turbulence via the use of a statistical mechanical theory that formally
maintains all the invariants, and the work by Miller (1990). It does appear, though, that
the stability of coherent structures (e.g. modons) may rely on higher order invariants, not
captured by a theory which only preserves the quadratic invariants.

5  NON-EQUILIBRIUM  FLOWS  AND  JETS 

Non-equilibrium  simulations 

Although  strictly  inviscid  flows  have  been  observed  to  evolve  into  statistical  equilibrium, 
the  presence  of  viscosity  can  nevertheless  have  large  effects  in  preventing  the  realisation  of 
statistical  equilibrium.  Wang  and  Vallis  (1993)  considered  the  effects  of  viscosity  in 
modifying  Fofonoff  flows  (see  also  Griffa  and  Salmon  1989;  Cummins  1992).  They  found 
that  the  additional  boundary  conditions  that  a  viscous  solution  must  satisfy  are  responsible 
for producing time-averaged states different from Fofonoff flows, with $q$-$\psi$ relationships
which showed strong deviations from linearity. For example, with free-slip boundary
conditions, the potential vorticity is constrained to the boundary value $\beta y$, and the $q$-$\psi$
scatter plots show considerable deviations from linearity in the neighborhood of the
boundary.  The  interior  flow  is  more  free  to  evolve  into  a  free  state  (really  a  minimum 
enstrophy  rather  than  maximum  entropy  state),  although  this  too  is  prevented  from 
complete  realization  by  potential  vorticity  homogenization  in  closed  gyres.  With  so-called 
‘super-slip’  boundary  conditions,  in  which  the  normal  derivative  of  vorticity  is  set  to  zero 
at  a  boundary,  the  boundary  layer  effects  are  reduced,  although  homogenization  still 
occurs. 

In the rest of this section, we would like to discuss another disequilibrium phenomenon, the
production of jets in the presence of a large-scale potential vorticity gradient, a
phenomenon not captured by the equilibrium theories. The presence of a beta-effect (apart
from the rather special homogeneous geometry) does in fact produce an anisotropic mean
flow which the equilibrium theories can capture. In a channel geometry the equilibrium
solution will be a zonal flow, given by the solution of (2.11b) with periodic boundary
conditions in the $x$-direction. Taking $\psi = 0$ at $y = -1$ and $y = 1$, the solution is


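$$ \psi(x,y) = \frac{\beta}{\mu}\left[y - \frac{\sinh(\sqrt{\mu}\,y)}{\sinh\sqrt{\mu}}\right] - \frac{\lambda}{\mu}\left[1 - \frac{\cosh(\sqrt{\mu}\,y)}{\cosh\sqrt{\mu}}\right] \qquad (5.0) $$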


For positive $\mu$ the scale of this purely zonal flow is the scale of the channel (Fig. 5). It is
the ‘Fofonoff channel flow’. It is anisotropic. (The fact that the equilibrium
doubly-periodic beta-plane flow is isotropic is a slightly artificial result, consequent on the
homogeneous geometry.) If the flow is required to be symmetric across the channel then
$\lambda = 0$. Then, the mean flow is only non-zero if $\beta \neq 0$. If $\mu < 0$ then the flow may produce
jet-like features. However, these correspond to a maximum enstrophy state and are not
necessarily stable.
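The contrast between positive and negative $\mu$ can be checked directly. The sketch below assumes no topography ($h = 0$), so that the channel flow solves $(\mu - d^2/dy^2)\psi = \beta y - \lambda$ with $\psi(\pm 1) = 0$, and it writes the solution in closed form with hyperbolic functions for $\mu > 0$ and trigonometric functions for $\mu < 0$. The function names, the finite-difference checks and the parameter values are illustrative assumptions.

```python
import math

def channel_psi(y, mu, beta, lam=0.0):
    """psi(y) solving (mu - d2/dy2) psi = beta*y - lam with psi(-1) = psi(1) = 0."""
    if mu > 0.0:
        kappa = math.sqrt(mu)
        odd = math.sinh(kappa * y) / math.sinh(kappa)
        even = math.cosh(kappa * y) / math.cosh(kappa)
    elif mu < 0.0:
        kappa = math.sqrt(-mu)   # oscillatory branch; breaks down if sin or cos of kappa is 0
        odd = math.sin(kappa * y) / math.sin(kappa)
        even = math.cos(kappa * y) / math.cos(kappa)
    else:
        raise ValueError("mu must be non-zero")
    return (beta / mu) * (y - odd) - (lam / mu) * (1.0 - even)

def zonal_velocity(y, mu, beta, lam=0.0, dy=1e-5):
    """u = -d(psi)/dy by a centred difference."""
    return -(channel_psi(y + dy, mu, beta, lam)
             - channel_psi(y - dy, mu, beta, lam)) / (2.0 * dy)

def residual(y, mu, beta, lam=0.0, dy=1e-3):
    """Residual of the Helmholtz equation, using a centred second difference."""
    d2 = (channel_psi(y + dy, mu, beta, lam) - 2.0 * channel_psi(y, mu, beta, lam)
          + channel_psi(y - dy, mu, beta, lam)) / (dy * dy)
    return mu * channel_psi(y, mu, beta, lam) - d2 - (beta * y - lam)

def count_sign_changes(mu, beta, lam=0.0, points=401):
    """Number of reversals of the zonal flow across the channel."""
    ys = [-1.0 + 2.0 * i / (points - 1) for i in range(points)]
    us = [zonal_velocity(y, mu, beta, lam) for y in ys]
    return sum(1 for a, b in zip(us, us[1:]) if a * b < 0.0)

reversals_positive_mu = count_sign_changes(40.0, 10.0)
reversals_negative_mu = count_sign_changes(-40.0, 10.0)

if __name__ == "__main__":
    print("flow reversals, mu = +40:", reversals_positive_mu)
    print("flow reversals, mu = -40:", reversals_negative_mu)
```

For $\mu = 40$ the flow reverses only in the two boundary layers, whereas for $\mu = -40$ the oscillatory solution has additional reversals in the interior, which is the jet-like behaviour referred to above.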


Figure 5. Fofonoff flow in a channel. Shown is
the zonal flow, namely $u = -\partial\psi/\partial y$, where $\psi$ is
given by (5.0) with $\lambda = 0$ and: (a) $\mu = 40$, $\beta = 10$,
and (b) $\mu = -40$, $\beta = 10$.


In a forced-dissipative turbulent flow, another mechanism comes into play which
the equilibrium theory does not capture, and jets may be produced (Vallis and
Maltrud 1993). Briefly, the mechanism is as follows. The frequency associated with
a Rossby wave is $\beta k_x / k^2$, whereas the ‘frequency’ associated with turbulent
motion is more like $Uk$, where $U$ is the rms velocity of the flow. (Vallis and Maltrud
discuss other possibilities for the ‘turbulent frequency,’ and show that other choices, for
example $\zeta$, where $\zeta$ is the mean vorticity, make little difference to the following
argument.) If the Rossby wave frequency is much higher than the ‘turbulent frequency,’
then wave-like motion dominates over turbulent motion. However, it will be very difficult
to excite such Rossby waves for that same reason: their natural frequency is much higher
than that of the forcing turbulent motion. Now, the ‘turbulent frequency’ is, to lowest
order, isotropic. However, the Rossby wave frequency is most decidedly not. Figure 6
shows the wave-turbulence boundary: within the dumb-bell shape the Rossby wave
frequency is higher than the turbulent frequency, and energy transfer into this region is
inhibited. Energy cascading to larger scales ‘avoids’ the modes within the wave region.
The cascade to large scales is then most efficiently achieved by the excitation of zonal
flow. The isotropic cross-over scale between waves and turbulence is given by

$$ k_\beta = \sqrt{\beta / U} \qquad (5.1) $$


Figure 6. Wave-turbulence
boundary in k-space. Plotted
is the locus of points whose
Rossby wave frequency,
$\beta k_x / k^2$, equals a ‘turbulent’
frequency $Uk$. Within the
‘dumb-bell’  the  frequency  of 
Rossby  waves  exceeds  that  of 
the  inverse  turbulence 
timescale. 


if the simple expression $Uk$ is used for the turbulent frequency. (Other expressions are
found if the turbulent frequency is parameterized differently.) The zonal flow
will not be found quantitatively at this scale because, as seen in Figure 6, Rossby waves
give no restriction on the scale of the zonal motion: for the zonal flow the Rossby
wave frequency vanishes. However, we should expect the zonal jets to
have qualitatively the scale $k_\beta$, since the cascade to larger scales will be very inefficient
once the energy has become largely zonal.
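The wave-turbulence boundary can be written as a simple test on a wavevector. The sketch below assumes the Rossby wave frequency magnitude $\beta |k_x| / k^2$ and the turbulent frequency $Uk$; equating the two gives the dumb-bell $k = k_\beta \sqrt{|\cos\theta|}$, where $\theta$ is the angle of the wavevector from the $k_x$ axis and $k_\beta = \sqrt{\beta/U}$. The function names and the values of `BETA` and `U_RMS` are hypothetical.

```python
import math

def k_beta(beta, u_rms):
    """Isotropic cross-over wavenumber sqrt(beta / U)."""
    return math.sqrt(beta / u_rms)

def rossby_frequency(kx, ky, beta):
    """Magnitude of the Rossby wave frequency, beta*|kx| / (kx^2 + ky^2)."""
    return beta * abs(kx) / (kx * kx + ky * ky)

def turbulent_frequency(kx, ky, u_rms):
    """Simple estimate U*k of the inverse turbulent timescale."""
    return u_rms * math.hypot(kx, ky)

def wave_dominated(kx, ky, beta, u_rms):
    """True inside the dumb-bell, where waves are faster than the turbulence."""
    return rossby_frequency(kx, ky, beta) > turbulent_frequency(kx, ky, u_rms)

def dumbbell_radius(theta, beta, u_rms):
    """Wavenumber on the boundary at angle theta from the kx axis."""
    return k_beta(beta, u_rms) * math.sqrt(abs(math.cos(theta)))

BETA, U_RMS = 10.0, 0.1   # illustrative values only; they give k_beta = 10

if __name__ == "__main__":
    print("k_beta =", k_beta(BETA, U_RMS))
    print("(kx, ky) = (5, 0) wave dominated:", wave_dominated(5.0, 0.0, BETA, U_RMS))
    print("(kx, ky) = (0, 5) wave dominated:", wave_dominated(0.0, 5.0, BETA, U_RMS))
```

Modes with $k_x = 0$, that is zonal flows, are never inside the wave region at any scale, which is why the cascade can continue to large scales through them.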

This robust mechanism seems responsible for the production of zonal flow in
forced-dissipative beta-plane simulations. (A related but slightly different mechanism was first
proposed  by  Rhines  1975.)  However,  although  it  does  not  rely  on  dissipation  to  work,  it  is 
not  a  feature  of  the  inviscid  statistical  mechanical  simulations,  because  it  is  a  transient 
effect.  Although  it  is  more  ‘difficult’  to  initially  excite  modes  in  the  wave  regime,  in  time 
energy  will  nevertheless  creep  into  the  wave  region  and  remain.  However,  if  energy  is 
being  removed  at  low  wavenumbers,  by  viscosity  or  Ekman  friction,  then  a  constant  state 
of  anisotropy  can  be  maintained.  Thus,  in  forced  dissipative  flows,  jet-like  zonal  structures 
timescale on which a statistical mechanical equilibrium can be maintained. The equilibrium
state  of  forced-dissipative  beta-plane  turbulence  is  zonal  flow,  whereas  the  equilibrium 
state  of  inviscid  beta-plane  turbulence,  in  a  homogeneous  domain,  is  isotropic  flow. 


Flow  over  topography  provides  a  very  similar  example  of  where  dissipative  flow  can  be 
different  from  the  inviscid  equilibrium,  or  from  the  minimum  enstrophy  state.  Again  two 
mechanisms  are  involved,  only  the  first  of  which  the  equilibrium  theory  is  able  to  capture. 
This  is  the  mechanism  which  first  generates  a  mean  flow  over  the  topography;  it  can  be 
interpreted  as  one  of  vortex  segregation.  Consider,  say,  a  single  hump  (a  ‘sea-mount’)  in 

an eddy field. Fluid moved up the hump conserves its potential vorticity, so its
relative vorticity falls. Similarly, fluid moving off the hill into a valley
increases its relative vorticity. The upshot is a negative correlation (for
positive $f$) between topography and vorticity (or in general anti-cyclones
over humps), and the generation of a mean flow. (This same mechanism is
responsible for producing the zonal flow on a beta-channel.) Both maximum
entropy and minimum enstrophy quantify this phenomenon; although
neither theory is aware of the conservation of potential vorticity on
parcels, utilization of the quadratic invariants gives rise to a linear
relationship between streamfunction and potential vorticity, as in (2.11b).
Inviscid simulations (Wang and Vallis 1993) indeed show that a maximum
entropy state is realized. However, the addition of viscosity can have an
important effect. For topography comprising a single ridge, the simple
prediction is of flow parallel to the ridge in a pseudo-westward fashion (that is,
facing downstream, higher values of potential vorticity are to the right). In a
forced-dissipative situation this prediction is not, however, always
realized. For small values of the topography, flow similar to that
prediction is realised. However, for larger topographic heights, jets form parallel to the
topographic contours (Fig. 7). Essentially, as far as the flow is concerned, the topography
seems like a beta-effect, and for the same reason that jets form on a beta-plane,
topographic jets form parallel to iso-lines of topography. The condition for the appearance
of jets is that the jet scale be smaller than the cross topography scale, where the jet scale is
given by (5.1), but with variation in topographic height replacing the beta effect as the
cause of the large scale potential vorticity gradient.

Figure 7. The time averaged velocity along a simple meridional ridge on the f-plane for various topographic
heights in a barotropic forced-dissipative simulation, forced near wavenumber 12. The velocity is plotted as a
function of cross-slope co-ordinate. The topography is peaked at the center line, with no along-slope variation.
(a) The amplitude of the topography is $h = 20$; the flow has approximately the same scale as the topography.
(b) $h = 200$; jets begin to appear, superimposed on the broad background flow. (c) $h = 1000$; stronger jets appear.

7  DISCUSSION  AND  OCEANIC  RELEVANCE 

Statistical mechanical ideas have been applied in two general areas in ocean dynamics: to
the gyre scale quasi-horizontal circulation, and to flow over topography. The question is
whether  the  real  ocean  circulation  should  pay  attention  to  any  of  the  ideas  we  have 
discussed  herein.  Clearly,  the  ocean  is  neither  unforced  nor  inviscid,  and  we  cannot  expect 
it  to  quantitatively  reproduce  a  statistical  mechanical  equilibrium.  For  example,  the 
spectrum  of  eddy  kinetic  energy  is  more  likely  to  be  a  consequence  of  forced-dissipative 
geostrophic  turbulence  than  an  approach  to  inviscid  equilibrium.  If  such  ideas  have  any 
meaning,  then,  they  will  be  found  in  the  tendency  of  flow  toward  such  equilibrium, 
constrained  by  the  consequences  of  forcing  and  dissipation.  This  point  of  view  has  also 
been taken by Holloway (1992). Thus, the minimum enstrophy state, which is perhaps a
step  closer  to  a  forced-dissipative  reality,  can  be  seen  as  a  consequence  of  the  nonlinear 
dynamics  trying  to  evolve  into  a  maximum  entropy  state,  plus  the  effect  of  small-scale 
preferential  dissipation  of  enstrophy. 

The  minimum  enstrophy  and  maximum  entropy  states  are  ‘free’  solutions  of  the  equations 
of  motion,  and  so  a  tendency  toward  these  states  is  a  tendency  toward  free  solutions.  The 
notion  of  large  scale  quasi-horizontal  circulation  as  a  free  solution  (a  Fofonoff  state)  is 
rather opposed to the forced-dissipative Stommel-like models. The reconciliation of these
viewpoints is not obvious (nor indeed is it obvious that ‘reconciliation’ is the correct attitude), for the
forced-dissipative viewpoint alone is simple and appealing. Yet to the extent that free
nonlinear evolution is possible the system will attempt to evolve toward a free solution.
Minimum  enstrophy  (or  maximum  entropy)  states  are  stable  solutions  of  the  equations  of 
motion.  Are  they  attractors?  The  inviscid  equations,  being  Hamiltonian,  have  no  attractors 
and  it  is  not  correct  to  call  the  maximum  entropy  state  an  attractor.  However,  the  related 
minimum  enstrophy  state  is  an  attracting  state,  in  the  absence  of  forcing.  The  competitive 
roles  of  forcing  and  free  evolution  then  will  together  determine  the  (statistically)  steady 
state  ultimately  achieved. 

The  situation  where  free  evolution  is  likely  to  be  most  apparent,  and  equilibrium  solutions 
actually  manifest  themselves,  is  probably  in  mesoscale  phenomena.  Here  the  free  evolution 
of  turbulent  motion  has  more  rein  to  determine  the  mean  flow.  The  production  of  mean 
flows  around  topographic  features  would  be  a  direct  consequence  of  such  free  evolution. 
The  quasi-passive  free  advection  of  vorticity  over  topography  will  lead  to  a  negative 
correlation between vorticity and topography, that is, anti-cyclonic motion over hills and
cyclones over valleys. The mean flow is pseudo-westward, that is, facing downstream,
higher  values  of  potential  vorticity  are  on  the  right.  Over  ridges  or  continental  slopes  this 
results  in  polewards  (equatorwards)  mean  flows  on  the  western  (eastern)  sides  of 
meridional  ridges.  One  may  conjecture  that  this  is  the  cause  of  the  almost  ubiquitous 
polewards  undercurrents  in  eastern  boundary  currents.  Since  such  mean  flows  are 
generated  by  the  interaction  of  topography  and  mesoscale  eddies,  the  strength  of  the  mean 
flow  will  directly  depend  on  the  strength  of  the  eddy  field.  The  eddy  field  must  exist 
independently  of  the  topography.  It  may  be  a  result  of  baroclinic  instability  of  a  large  scale 
flow  giving  rise  to  a  sea  of  mesoscale  eddies.  However,  if  the  eddies  are  themselves 
produced  by  an  instability  involving  a  large-scale  mean  flow  and  the  topography,  the  phase 
relationships  between  the  eddies  and  the  topography  may  be  quite  different,  and  the  eddies 
will  not  be  passively  advected  over  the  topography. 

If  the  topography  is  sufficiently  steep  then  a  second  effect  becomes  noticeable,  namely  the 
concentration  of  the  mean  flow  into  narrow  currents,  via  the  topographic  beta-effect — just 
as  the  more  familiar  beta-effect  due  to  differential  rotation  produces  zonal  jets.  The 
criterion  to  see  such  an  effect  is  that  the  topography  is  sufficiently  steep  and  sufficiently 
broad  that  the  width  of  the  topographic  jets  is  narrower  than  the  cross-slope  scale  of  the 
topography. Possible locations for such phenomena are on continental slopes and mid-ocean
ridges, although the criterion for multiple jets may never be actually satisfied in the
ocean.  Multiple  jets  do  of  course  exist  in  Jupiter’s  atmosphere,  and  it  has  sometimes  been 
suggested that the earth's atmosphere verges on having two jet-streams, rather than one.

Observational  testing  of  these  ideas  is  possible.  One  such  test  would  be  to  demonstrate  the 
unambiguous  existence  of  mean  currents  flowing  more-or-less  parallel  to  the  topography 
on  mid-ocean  ridges  or  around  mid-ocean  sea-mounts.  The  production  of  mean  flows 
along  continental  borderlands  is  also  predicted  by  the  theory,  and  here  the  sense  of  the 
mean flow is to produce polewards flowing undercurrents along the eastern edge of ocean
basins  and  equatorward  flowing  currents  along  the  western  edge.  These  are  counter  to  the 
mean  surface  flow  of  the  large  scale  gyre  structure  and  may  be  the  cause  of  the  ubiquitous 
counter  currents,  especially  the  polewards  counter  currents  seen  on  the  eastern  edge  of  a 
number  of  ocean  basins  (Neshyba  et  al.  1989).  However,  there  are  other  theories  for  that 
phenomenon which do not rely on the topography but on the wind-stress and ageostrophic
phenomena  (MacCreary  1981),  and  the  situation  is  not  definitively  resolved.  If  currents 
can  be  observed  around  seamounts  where  there  is  little  systematic  wind-forcing  then  it 
would  be  hard  to  avoid  a  theory  involving  eddy-topographic  interactions,  such  as  those 
described  here.  A  prediction  of  the  theory  is  that  the  strength  of  the  mean  flow  is 
correlated  with  the  strength  of  the  eddy  field,  and  this  may  be  amenable  to  direct 
verification.  Finally,  a  practically  useful  aspect  of  statistical  mechanical  concepts  may  lie  in 
their  use  in  subgrid-scale  representation,  and  the  interested  reader  is  referred  to  the 
chapter by Holloway in this volume.

ABSTRACT

Because  oceans  are  bigger  than  the  computers  that  model  them,  most  of  what  goes  on  in 
oceans  cannot  be  represented  adequately.  Ability  to  observe  the  ocean  is  limited.  These 
considerations  compel  a  probabilistic  view,  both  of  the  “observed”  ocean  and  especially 
taking  account  that  models  really  solve  for  moments  of  probability  of  possible  states  of  the 
ocean.  We  rethink  how  models  should  work,  here  taking  into  account  statistical 
mechanical  ideas  about  ocean  circulations.  There  is  a  difficulty.  The  problems  that  can  be 
treated  by  methods  of  statistical  mechanics  are  far  from  the  practical  problems  of  ocean 
modelling.  We  entertain  a  hybrid  approach — employing  conventional  ocean  modeling  to 
deal  with  application  of  large  scale  forcing  while  extending  model  physics  to  recognize  the 
oceans’  internal  dynamical  tendency  toward  higher  system  entropy. 

INTRODUCTION 

There  are  many  reasons  to  apply  statistical  methods  in  physical  oceanography,  as  seen  in 
the  many  contributions  in  this  volume.  In  the  present  article  we  focus  on  a  particular 
aspect.  We  ask  to  what  extent  we  may  treat  statistics  of  flows  as  dynamical  objects.  The 
challenge  is  to  determine  what  are  the  equations  of  motion  of  statistics  of  flows. 

This  invites  us  to  reconsider  what  the  “fluid  dynamic  enterprise”  is  about.  In  its  usual 
context,  fluid  dynamics  deals  with  partial  differential  equations  describing  fields  of 
momenta, density, and so forth. Given boundary and initial conditions, the goal is this:
solve.  Often  “solve”  is  too  tough,  so  the  strategy  may  be  to  obtain  simplifying 
approximations  or  idealizations,  and  then  solve.  For  most  practical  applications,  “solve” 
includes  also  a  finite  discretization  to  some  numerical  representation  of  the  intended 
equations  of  motion. 

In  ocean  modeling  is  this  really  what  we  do?  I  think  not.  Even  in  a  domain  as  small  as  a 
bay  or  a  harbour,  we  aren’t  given  the  initial  conditions  and  boundary  conditions  at  the  fine 
scales  for  which  actual  equations  of  fluid  flow  apply.  Moreover  we  likely  couldn’t  “solve” 
the  equations  of  motion  if  we  did  have  this  information!  And  the  global  ocean  is  so  much 
bigger.

What to do? To a large extent, ocean modeling succeeds by luck and by cheating. We take
solutions  to  idealized  problems  and  compare  with  our  partial  information  about  the  real 
ocean.  [Although  computer  models  are  often  characterized  as  “realistic,”  that  should  be 
read  only  as  “less  idealized”  than  some  other  model.]  When  we  compare  solutions  with 
reality,  we  don’t  truly  ask  if  reality  coincides  with  the  solution.  In  reality,  when  we 
measure  velocity  or  temperature  or  elevation  somewhere  at  some  time,  we  see  wiggles, 
whirls, blips and so on. In part there is always measurement error. In a greater part
though,  we  appreciate  that  oceanic  flows  are  nearly  always  characterised  by  a  lot  of 
wiggles  and  whirls,  over  many  scales  of  motion.  Thus,  when  we  compare  idealized 
solutions  or  model  output  with  “reality,”  we  should  be  obliged  to  append  a  phrase  “in  the 
mean,”  desperately  hoping  no  astute  reader  asks  what  we  mean  by  “the  mean.” 

Something  statistical  has  got  into  this  “fluid  dynamic  enterprise.”  We  have  compared 
‘apples  with  oranges,’  testing  explicit,  fully  determined  solutions  to  idealized  problems 
against  some  sort  of  statistical  measures  of  reality. 

THE  PHASE  SPACE  OF  THE  OCEAN 

To  pose  the  question  consistently,  we  might  deal  with  probability  throughout.  Let  Y 
represent  the  state  of  the  fluid  at  any  instant.  In  practice,  Y  will  consist  of  some  finite 
representation,  perhaps  the  velocities,  densities  and  whatever  at  many  grid  points,  or 
perhaps  the  coefficients  from  expansion  on  some  set  of  basis  functions.  Y  may  have  a  huge 
number of components, perhaps a million or more if we think of large scale supercomputer
representations or we may speak of zillions (any number) of components of Y. Vector Y is
a “point” in the multi-dimensional “phase space” of all possible Y. Deterministic equations
of motion yield a trajectory dY/dt = G(Y). In reality, Y(t) is fantastically complicated,
representing  every  tiny  whirl  and  wiggle  in  the  ocean.  It  seems  doubtful  we  could  ever 
have  so  much  information  or  would  ever  want  it  if  we  could  have  it. 


Aside:  How  big  is  the  phase  space  of  the  ocean?  If  we  think  of  continuous 
fields,  then  size  is  power  of  continuum.  However,  we  recognize  that  there  is 
some  scale  below  which  viscous-diffusive  effects  smooth  the  fields.  That 
scale  depends  upon  turbulent  intensity,  which  varies  greatly.  Moreover, 
velocity,  temperature  and  salinity  will  be  smoothed  at  different  scales 
because  of  their  different  molecular  diffusivities.  The  result  overall  is  that  in 
more  intense  regions  in  the  upper  ocean,  the  smooth  scale  will  be 
significantly  less  than  1  cm.  In  a  weakly  turbulent  deeper  ocean,  the  scale 
may  be  several  cm.  To  make  a  kind  of  “average”  for  back-of-envelope 
estimation, say the number is “around” 2 cm. In the ocean there are roughly
$1.3 \times 10^{24}$ cc of water. If we take the standard incompressibility assumption,
we will have four dependent variables (two components of velocity,
temperature and salinity, say) in each 2 x 2 x 2 cm volume element. Thus the
size of the phase space is $1.3 \times 10^{24} \times 4 / 2^3$, or something over $6 \times 10^{23}$.
What would Avogadro have thought of that?!

Of necessity, as well as by thoughtful intent, we wear “smoky glasses” when looking at the
ocean. We do not see a “point” Y, we see a “blur,” a cloud of possible Y. Thus it only
makes sense to speak of the ocean in terms of probability p(Y)dY that the actual state of
the ocean lies within phase volume dY of some Y. We then pose the ocean problem by
saying that initially we have some $p(Y; t = 0) = p_0(Y)$, some probabilistic statement of
boundary and forcing conditions, and we wish to solve for p(Y; t) at future t.
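The back-of-envelope count in the aside on the size of the phase space can be checked in a few lines. This is only a sketch of the arithmetic quoted there; the constant and function names are illustrative and not from the text.

```python
# Back-of-envelope count of ocean "phase space" components, following the aside.
# All inputs are the round numbers quoted in the text; names are illustrative.

OCEAN_VOLUME_CC = 1.3e24      # approximate volume of ocean water, cubic centimetres
SMOOTHING_SCALE_CM = 2.0      # assumed "average" smoothing scale
VARIABLES_PER_CELL = 4        # two velocity components, temperature, salinity


def phase_space_size(volume_cc, scale_cm, n_vars):
    """Degrees of freedom = (number of cells) x (variables per cell)."""
    cell_volume_cc = scale_cm ** 3
    return volume_cc / cell_volume_cc * n_vars


n_components = phase_space_size(OCEAN_VOLUME_CC, SMOOTHING_SCALE_CM, VARIABLES_PER_CELL)
print(f"{n_components:.2e}")  # 6.50e+23, a little over Avogadro's number
```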

It  seems  we’ve  made  the  problem  worse.  Before  we  had  too  many  Y.  Now,  for  every  Y, 
we also want a continuous function p(Y). Moreover, we had at least an equation of
motion for Y; what is the equation of motion for p(Y)? Happily, things start to get better.
Some of what we really want are only moments of p(Y), starting with first and second
moments: $\langle Y \rangle = \int Y \, p(Y) \, dY$ and $\langle YY \rangle = \int YY \, p(Y) \, dY$. These include things like the
“average” (“expected”) current, temperature or salinity, or average heat transport or eddy
energy,  for  example.  As  well,  when  we  appreciate  that  we  are  only  interested  in  moments 
of  Y,  we  choose  not  to  examine  Y  in  all  its  10^^  phase  space  detail;  10^  or  10^  or  fewer 
numbers  might  be  all  we  care  about.  Although  this  discussion  may  provide  a  viewpoint, 
actual  value  rests  on  displaying  an  explicit  means  of  calculation.  How  do  we  obtain  useful 
<Y>,  say?  What  are  the  equations  of  motion  of  <Y>? 
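To make the moments concrete, here is a minimal sketch that estimates <Y> and <YY> from a finite "cloud" of equally likely states. It assumes we can draw samples from p(Y), which is exactly what is hard in practice; the two-component Gaussian cloud and all names are hypothetical.

```python
import random


def ensemble_moments(samples):
    """Estimate <Y> and <YY> from equally weighted samples of the state vector Y.

    samples: list of state vectors (lists of floats), all of the same length.
    Returns (mean, second) with mean[i] ~ <Y_i> and second[i][j] ~ <Y_i Y_j>.
    """
    n = len(samples)
    dim = len(samples[0])
    mean = [sum(y[i] for y in samples) / n for i in range(dim)]
    second = [[sum(y[i] * y[j] for y in samples) / n for j in range(dim)]
              for i in range(dim)]
    return mean, second


# Hypothetical 2-component "ocean": Y_0 is a current with mean 1, Y_1 an eddy
# signal with zero mean. The Gaussian form is purely illustrative.
rng = random.Random(0)
cloud = [[1.0 + rng.gauss(0.0, 0.5), rng.gauss(0.0, 2.0)] for _ in range(20000)]
mean, second = ensemble_moments(cloud)
print(mean)      # first moments, close to [1, 0]
print(second)    # second moments; the diagonal is close to [1.25, 4]
```

Two numbers for the mean and four for the second moment summarize twenty thousand sampled states, which is the economy the text is after.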


Aside:  A  topic  often  mentioned  at  this  ‘Aha  Huliko‘a,  and  elsewhere,  is 
chaos.  It  may  be  argued  that  chaotic  behavior  in  low  order 
deterministic  systems  reveals  a  kind  of  dynamics  for  which  we 
thought  ideas  of  probability  were  needed.  Is  the  ocean  chaotic?  If  the 
question  asks  if  nearby  trajectories  Y(t)  diverge  exponentially,  the 
answer  is  surely  yes.  If  the  question  asks  whether  there  exists  a  lower- 
dimension  attracting  object  in  the  phase  space,  again  the  answer  is 
surely  yes.  If  the  practical  question  is  reducing  dimension  from  10^^ 
to  a  mere  10^^,  say,  then  there  is  little  utility  in  finding  such  an 
attractor  (if  we  can  find  such  an  attractor).  A  point  is  that  even  if  we 
could  deal  with  deterministic  dynamics,  we  might  wish  to  introduce 
p(Y)  as  the  object  of  investigation,  with  practical  goals  to  obtain 
expectations  <Y>  and  <YY>,  say. 


OCEAN  MODELING 

First  consider  the  ocean  modelers’  cheat.  We  guess  and  hope  that  equations  for  <Y>  are  a 
lot  like  the  textbook  equations  for  Y.  We  observe  that  if  the  equation  for  Y  were  linear  in 
Y,  we  could  pass  <•>  over  this  equation  and  be  done.  Easy.  Unhappily,  the  equation  is  not 
linear and we are faced with unknown <YY> in the equation for <Y>. So we replace
<YY> by <Y><Y> + <Y'Y'> where Y' = Y - <Y>. This is Reynolds averaging, here under
<•>. It helps some, but leaves unknown <Y'Y'>. Now we complete the cheat by copying
someone else’s cheat. (When cheating it is ever-so-helpful to copy others’ cheats. If called
out, you can appeal to the list of all the cheaters who have gone before.) The standard
cheat is to characterize flux components of <Y'Y'> by a Fickian relation to spatial
gradients of <Y>. It’s “eddy viscosity.” The lovely thing about this cheat is that the
equation of motion for Y already has a term like that, ascribed to diffusion by molecular
chaos. Thus the equation for <Y> really is just the equation for Y if only we fudge certain
coefficients. 
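The Reynolds split used above is an exact identity, which a short numerical check makes plain. The sketch below uses a made-up scalar "velocity record" (mean flow plus Gaussian eddies); it illustrates only the decomposition, not any closure for the eddy term.

```python
import random


def reynolds_split(samples):
    """Return (<Y*Y>, <Y>*<Y>, <Y'Y'>) for scalar samples, with Y' = Y - <Y>."""
    n = len(samples)
    mean = sum(samples) / n
    mean_of_square = sum(y * y for y in samples) / n
    eddy_part = sum((y - mean) ** 2 for y in samples) / n
    return mean_of_square, mean * mean, eddy_part


rng = random.Random(1)
# Hypothetical record: mean flow 0.2 plus eddies of standard deviation 1.0
u = [0.2 + rng.gauss(0.0, 1.0) for _ in range(5000)]
total, mean_part, eddy_part = reynolds_split(u)
print(total, mean_part + eddy_part)   # the two numbers agree to rounding error
print(eddy_part / mean_part)          # here the eddy term dwarfs the mean term
```

In this made-up record the unknown <Y'Y'> is much larger than the known <Y><Y>, which is why whatever is substituted for it matters so much.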

Does the cheat work? It is believed to work (sort of) in a variety of turbulent flows,
including many industrial applications. That’s encouraging (maybe). And ocean models
work (sort of), don’t they? The answer depends on what we mean by “work.” Gross
features of upper ocean circulation may be more-or-less directly forced by wind or
buoyancy. Integral measures such as Ekman transport, Sverdrup relation and volume
conservation already constrain gross <Y>. So long as models respect these relations, they
will work (sort of). Examined more closely, problems appear. Even the surface circulation,
which most feels direct forcing, can be problematic in, e.g., western boundary current
separation  regions.  Moreover,  problems  seen  near  surface  get  worse  as  one  looks  deeper 
in  the  water  column.  Flows  that  run  poleward  along  eastern  boundaries  may  run  the  wrong 
way  in  models;  undercurrents  along  western  boundaries  may  be  absent  or  weak.  Perhaps 
the  standard  cheat  doesn’t  really  work  so  well. 

What  to  do?  There  is  a  standard  fix  for  the  standard  cheat:  Get  a  bigger  computer.  At  finer 
resolution,  less  of  Y  is  left  in  Y',  so  <Y'Y'>  is  smaller  and  the  cheat  can  be  made  smaller. 
Modern supercomputers may advance a state vector of length $10^7$. We need only await
a $10^{16}$-fold increase in computing power (speed+memory) and the problem is solved.
(Feasibly one hopes that a “mere” 100-fold increase might substantially improve the
mesoscale  eddy  problem,  as  one  part  of  the  bigger  problem.) 

STATISTICAL  MECHANICS:  Equilibrium  and  Disequilibrium 

If  the  only  available  method  is  increased  computing  power,  and  if  answers  about  ocean 
circulation  are  needed  sorely  enough,  then  the  necessary  computing  resource  will  have  to 
be  created  and  dedicated  to  this  purpose.  When  (if)  that  could  happen,  at  what  cost,  I 
can’t  guess.  The  question  we  ask  here  is  to  what  extent  theory  of  statistical  mechanics  can 
provide a practical complement to ‘brute force’ computing.

Statistical mechanics comes in two flavours: equilibrium and disequilibrium. Equilibrium
statistical  mechanics  addresses  isolated  dynamical  systems,  seeking  the  p(Y)  “in 
equilibrium”  (i.e.,  if  the  system  has  been  isolated  forever).  Although  this  is  clearly  an 
idealization,  it  is  the  basis  for  understanding  much  of  classical  thermodynamics.  The  more 
difficult  problems  arise  in  disequilibrium  statistical  mechanics,  including  circumstances  of 
open  systems  where  energy  or  information  passes  through  a  system,  or  where  conditions 
change  rapidly  in  time.  Applied  to  macroscopic  fluid  flows,  disequilibrium  statistical 
mechanics is better known as turbulence closure theory. One might, for example, imagine
turbulence theory helping with the specification of eddy viscosities. In fact we’ll see
(shortly)  a  much  bolder  result,  one  that  suggests  the  equations  of  motion  used  by  ocean 
models  are  wrong — and  not  only  by  uncertain  coefficients. 

Both  the  equilibrium  and  disequilibrium  flavours  have  been  exercised  with  respect  to 
geophysical  flows.  It  is  beyond  the  scope  of  this  article  to  review,  or  even  make  mention 
of,  those  exercises.  Reviews  can  be  found  in  Holloway  (1986)  or  Lesieur  (1990).  A  recent 
review  of  turbulence  theories  is  in  McComb  (1990). 

Here  let  us  recall  two  of  the  simplest  examples,  one  from  equilibrium  and  one  from 
disequilibrium.  The  examples  are  chosen  because  they  feed  directly  into  the  practical 
application  which  follows. 

Equilibrium 

First consider the idealized case of barotropic vorticity advection on an f-plane, with rigid
lid and a bottom of variable depth H(x). Without forcing or dissipation, the equation of
motion for Y is

$$\frac{D}{Dt}(\zeta + h) = 0 \qquad (1)$$

where $D/Dt$ is the material derivative $(\partial/\partial t + \mathbf{u} \cdot \nabla)$, $\zeta$ is the vertical component of
vorticity $\nabla \times \mathbf{u}$, and $h = f(H_0 - H)/H_0$ is a potential vorticity due to variation of H about
reference depth $H_0$. Variation of H is presumed small so $|h| \ll f$. Suppose the initial
conditions are random eddies without mean flow, hence $\langle\zeta\rangle = 0$. We seek $\langle\zeta\rangle$ at later
$t > 0$. For a problem this simple, we can directly solve (1) for a number of realizations of
$\zeta$, then average to get $\langle\zeta\rangle$. What do we guess may be the outcome? If $\langle\zeta\rangle = 0$ at $t = 0$
then does $\langle\zeta\rangle = 0$ for all time?

The numerical experimental result for mean velocity $\langle\mathbf{u}\rangle$ is shown in Figure 1a. Here h(x)
has been chosen to resemble the Arctic Ocean. That’s just for fun, appreciating the
extraordinary idealization in (1). Clearly the answer is not $\langle\zeta\rangle = 0 = \langle\mathbf{u}\rangle$. In fact the
answer is only a subset of the more general answer given by Salmon et al. (1976).
Expressed in terms of streamfunction $\psi$ (from $\nabla^2\psi = \zeta$),

$$\left(\frac{\alpha_1}{\alpha_2} - \nabla^2\right)\langle\psi\rangle = h. \qquad (2)$$

The derivation of (2) is based on the observation that (1) conserves two integrals over the
flow field: energy $E = \frac{1}{2}\int dA\,|\nabla\psi|^2$ and enstrophy $Q = \frac{1}{2}\int dA\,(\nabla^2\psi + h)^2$. Phase vector
Y may be the collection of values of $\psi$ at grid points or the coefficients of $\psi$ expanded on
some basis set. A great many Y are consistent with most possible E and Q. Salmon et al.
make the ergodic hypothesis that Y may equally likely visit all possible configurations
consistent with E and Q. Salmon et al. show that the result asymptotes to (2) where $\alpha_1$
and $\alpha_2$ are Lagrange multipliers to enforce the constraints to E and Q. A simple route to
the result, maximizing entropy $S = -\int dY\, p(Y) \log p(Y)$ subject to E and Q, is in an appendix
to Holloway (1992, hereafter H92).
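For orientation, the maximum entropy route to (2) can be sketched in a few lines. This is an outline under two assumptions not spelled out above: the constraints are imposed on the expected values of E and Q, and $\psi$ vanishes on the boundary (or the domain is periodic). The full argument is in Salmon et al. and in the appendix to H92.

```latex
% Maximize S = -\int dY\, p \log p with Lagrange multipliers \alpha_1, \alpha_2
% on the expected energy E and enstrophy Q.
\begin{align*}
  p(Y) &\propto \exp\left(-\alpha_1 E - \alpha_2 Q\right),
  \qquad E = \tfrac{1}{2}\int dA\,|\nabla\psi|^2, \quad
  Q = \tfrac{1}{2}\int dA\,(\nabla^2\psi + h)^2 .
\intertext{Since $E$ and $Q$ are quadratic in $\psi$, $p$ is Gaussian, and its mean
$\langle\psi\rangle$ is the state that makes $\alpha_1 E + \alpha_2 Q$ stationary.
Integrating the energy variation by parts,}
  0 &= \delta(\alpha_1 E + \alpha_2 Q)
     = \int dA\,\left[-\alpha_1\langle\psi\rangle
       + \alpha_2\left(\nabla^2\langle\psi\rangle + h\right)\right]\nabla^2\delta\psi ,
\intertext{which must hold for arbitrary $\nabla^2\delta\psi$, hence}
  \left(\frac{\alpha_1}{\alpha_2} - \nabla^2\right)\langle\psi\rangle &= h .
\end{align*}
```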

An immediate consequence of (2) is that the answer is not $\langle\mathbf{u}\rangle = 0$. Observe that this is
qualitatively contrary to any manner of eddy viscosity that ultimately seeks to drag mean
flow toward a state of rest. We observe also that in (2), only the ratio $\alpha_1/\alpha_2$ appears, which
may be expressed in terms of a length scale $L^2 = \alpha_2/\alpha_1$. If it happens that we are only
interested in $\langle\psi\rangle$ on scales larger than L, we can omit $\nabla^2$ in (2) and write approximately
the simplest ever “theory” of ocean circulation: $\langle\psi\rangle = L^2 h$. No wind, no sun, no rain, no
moon, no ice, no whales.
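How good is the large-scale approximation? For a single sinusoidal component of topography, (2) can be solved exactly and compared with $L^2 h$. The sketch below assumes a one-dimensional periodic domain; the value of L and the wavelengths are hypothetical and not taken from the text.

```python
import math


def psi_amplitude(h_amplitude, wavenumber, length_scale):
    """Amplitude of <psi> for topography h = h_amplitude * cos(k x).

    For a single Fourier mode the Laplacian in (2) becomes -k**2, so
    (1/L**2 + k**2) * psi_hat = h_hat.
    """
    return h_amplitude / (1.0 / length_scale ** 2 + wavenumber ** 2)


L = 10.0e3  # hypothetical length scale in metres (not a value from the text)
for wavelength in (1000.0e3, 100.0e3, 10.0e3):
    k = 2.0 * math.pi / wavelength
    exact = psi_amplitude(1.0, k, L)
    approx = L ** 2 * 1.0          # the large-scale limit <psi> = L^2 h
    print(f"wavelength {wavelength / 1e3:6.0f} km: exact/approx = {exact / approx:.3f}")
```

The ratio is 1/(1 + k^2 L^2): close to one for topography much broader than L, and falling off quickly once the wavelength approaches L.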

Can it be so? Figure 1b shows $\langle\mathbf{u}\rangle$ obtained from our simplest ever $\langle\psi\rangle = L^2 h$. Although
the two panels look similar, they are not identical if overlaid. Figure 1a was produced by
averaging eleven realizations after a few thousand timesteps each, at cost of about 80
hours CPU on an Alliant FX40. (I meant to collect an even dozen realizations but ran out
of time.) My hunch is that after another 80 hours the average of direct realizations would
have got closer to Figure 1b. The simplest ever calculation of $\langle\mathbf{u}\rangle$ used about 1 second on
a Macintosh (wall clock). [A note. Overall velocity amplitude is not shown in Figure 1.
This can be scaled, so Figures 1a and 1b can be made to fit in terms of overall amplitude.]
It may seem a bit uncanny that Figure 1 has a lot in common with the synthesis of Arctic
circulation developed by Aagaard (1989). Happenchance, doubtless.

Figure 1. (a) An ensemble average at statistical stationarity over eleven realizations at 256 x 256
resolution of solutions of (1) from initial conditions consisting of random eddies. The ensemble average
flow is presented at 32 x 32 resolution. (b) The approximate theoretical flow, given by $\langle\psi\rangle = L^2 h$.


Aside:  Even  without  addressing  really  difficult  matters  such  as  forcing 
and  dissipation,  the  equilibrium  statistical  mechanics  of  (1)  is  not  as 
straightforward  as  I’ve  made  it  seem.  The  concern  is  that  we’ve  sought 
the maximum entropy solution subject only to constraints to E and Q.
In fact, within an enclosed basin or for other special boundary
conditions such as channel flow, the continuum solution to (1)
preserves $\int dA\, g(\nabla^2\psi + h)$ where g is any function. The solution is
enormously more constrained than we’ve taken into account. So why
does simple constraint by E and Q seem to “work”? Discussions on
this point have gone on for years, and can’t be dealt with in this
limited space. There are just a few comments: (a) The discrete
numerical representation of (1) is not faithful to the continuum
invariants of (1), and we really test statistical mechanics against the
numerical (1). (b) It may be that even if a hierarchy of invariants does
exist, the invariants don’t constrain the available phase space in ways
that very much affect low order moments such as <Y>. (c) One
invariant which numerical schemes may respect is circulation
$C = \int dA\, \nabla^2\psi$. In fact this can be taken into account, adding another
Lagrange multiplier $\alpha_3$ on the right side of (2), while stipulating also a
value of $\psi$ on a closed domain boundary. The result is to modify a
boundary current of width L.


In truth, simplest ever $\langle\psi\rangle = L^2 h$ can’t really serve as a theory of ocean circulation, in part
because  of  what  model  (1)  leaves  out.  Like  sun  and  wind  and  seagulls.  The  difficulty  is 
that  equilibrium  statistical  mechanics  applies  to  isolated  (closed)  systems,  whereas  the 
ocean  is  subject  to  external  forcing  and  internal  dissipation.  The  more  difficult  task  to 
include  forcing  and  dissipation  falls  under  the  category  of  disequilibrium  statistical 
mechanics. 

Disequilibrium 

Consider  the  simplest  extension  of  our  simple  model  (1),  including  some  dissipation 
operator such as $-a\zeta + b\nabla^2\zeta$. To achieve statistical stationarity, the flow might be excited
under some probability distribution of random torques. These are the open connections
which  prevent  application  of  simpler  equilibrium  methods.  For  disequilibrium  calculation 
one  often  permits  “slow”  time  dependence  of  low  order  statistics,  “slow”  in  the  sense  that 
higher  moments  remain  in  quasi-steady  adjustment  to  low  order  moments.  In  fluid 
dynamics context, this leads to the turbulence “closure” problem.

Let us maintain simplicity by assuming in (1) that topography h is entirely random. Among
realizations, the topography is chosen independently without any mean <h>. This permits
us to consider the case of no mean $\langle\zeta\rangle$, simplifying ensuing moment equations. Assuming
that second moments of h, say a wavenumber variance spectrum H(k), are given, the first
nontrivial moments we come to are $\langle\zeta\zeta\rangle$ and $\langle\zeta h\rangle$, perhaps characterized by
wavenumber spectra of vorticity variance and of vorticity-topography correlation. The
straightforward approach is to multiply (1) by $\zeta$ or by h, and average <•>. The problem is
that this generates new third-moment terms such as $\langle\zeta\,\mathbf{u}\cdot\nabla h\rangle$. Continuing by building equations
for these <YYY> only generates new unknown <YYYY>, and so on indefinitely in
explosive  proliferation.  Many  efforts  have  been  made  to  “close”  the  hierarchy  of  moment 
equations,  which  I’ll  not  begin  to  recount  here.  An  excellent  recent  reference  is  McComb 
(1990). 

The  first  disequilibrium  theoretical  results  for  the  case  of  barotropic  vorticity-topography 
interaction  were  those  of  Herring  (1977)  and  Holloway  (1978),  the  latter  comparing  with 
direct  numerical  experiments.  Comparisons  were  encouragingly  (surprisingly?)  good.  So 
this  is  the  way  to  go? 

Unfortunately  it’s  not  so  good  for  several  reasons.  First,  the  “theories”  are  all  subject  to  a 
certain  amount  of  “tuning”.  [My  personal  view  is  that  even  recent  theories  which  claim  to 
be  “free  of  phenomenological  constants”  have  actually  only  found  more  clever  ways  to 
hide the “adjustments.”] One may lack confidence in the generalizing power of these
theories. A second, and more damaging, shortcoming is that actual calculation from these
theories appears only to be feasible in highly idealized geometries corresponding to
statistically  homogeneous  (or  nearly  so)  fields.  Often  one  appeals  as  well  to  statistical 
isotropy  (or  near  thereto).  Third,  even  with  these  idealizations,  calculation  of  the 
theoretical  results  demands  nearly  as  much  computing  effort  as  direct  simulations.  [This  is 
especially disappointing when one has to do the direct simulations anyway, to see if the
theory is right.] Fourth, upon attempting to simplify the theoretical results (Holloway,
1987),  they  remain  too  unwieldy  for  practical  exercise  in  ocean  models.  Finally,  the 
equations  of  motion  for  which  these  results  are  available  may  be  only  idealizations  (such  as 
quasigeostrophy)  from  the  equations  intended  for  actual  ocean  models. 

The  disequilibrium  studies,  like  the  equilibrium  studies,  may  be  regarded  as  esoteric 
playthings  for  theoreticians.  Yet  there  are  lessons  to  be  learned.  First,  there  is  an  important 
connection  between  the  two  approaches.  It  is  entropy.  In  the  disequilibrium  studies, 
particularly  as  seen  in  turbulence  closures,  often  there  is  no  explicit  discussion  on  entropy. 
However, when a theory yields any set of second moments <YY>, say, one may evaluate
the entropy S subject to those <YY>. George Carnevale in his thesis (1979) has carefully
considered this, including the case of vorticity-topography interaction. The entropy can be
expressed $S = \frac{1}{2} \log \det \langle YY \rangle$, and George shows that a broad class of turbulence
closures demonstrate the Second Law: $dS/dt \ge 0$ in the absence of external forcing or
dissipation (Carnevale et al., 1981). Turbulence closure extends the calculation to show
“outside” influence (forcing and dissipation, the latter “outside” the Y treated by models).
As well, authors have considered the insights gleaned from consideration of equilibrium
solutions even when one ultimately has forced-dissipative reality in mind. See Frederiksen
(1982, 1985, 1986), Carnevale and Frederiksen (1987) or Holloway (1986). Nonetheless,
“insight”  is  a  matter  of  point-of-view,  and  the  present  point-of-view  is  far  outside 
mainstream  ocean  dynamics. 
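To see what the entropy expression S = 1/2 log det <YY> rewards, consider a toy two-component state. The sketch below assumes zero-mean Gaussian statistics, for which this form holds up to an additive constant, and compares made-up second-moment matrices with the same total variance.

```python
import math


def gaussian_entropy_2x2(cov):
    """S = 1/2 log det <YY> for a 2 x 2 second-moment matrix (constants dropped)."""
    det = cov[0][0] * cov[1][1] - cov[0][1] * cov[1][0]
    if det <= 0.0:
        raise ValueError("second-moment matrix must be positive definite")
    return 0.5 * math.log(det)


# Three hypothetical states with the same total variance (trace = 2):
lopsided = [[1.8, 0.0], [0.0, 0.2]]     # variance concentrated in one component
correlated = [[1.0, 0.6], [0.6, 1.0]]   # equal variances, correlated components
even = [[1.0, 0.0], [0.0, 1.0]]         # variance shared equally, uncorrelated

for name, cov in (("lopsided", lopsided), ("correlated", correlated), ("even", even)):
    print(name, gaussian_entropy_2x2(cov))
```

With total variance held fixed, entropy is highest when variance is shared evenly and the components are uncorrelated. What counts as "total variance" in the fluid problem is set by the invariants E and Q, so the realistic maximum entropy state generally has neither feature.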

ALTERNATIVELY, A HYBRID?

If  statistical  mechanics  only  provides  a  point-of-view,  whose  insights  are  a  matter  of  taste, 
what  can  we  do  practically  about  ocean  modelling?  Get  a  bigger  computer.  Sure.  Use 
eddy viscosity, explicitly or via numerical diffusion (because our forefathers have always
done  so).  Business  as  usual. 

We  may  try  to  do  better  than  that  for  two  reasons.  First,  it  is  not  clear  that  business-as- 
usual  is  on  the  path  to  success.  Doubtless  bigger  computers  will  help,  but  we  also  know 
that eddy viscosities are wrong because we know that eddy-topography interactions (for
example)  drive  rather  than  damp  mean  flows.  Second,  no  matter  how  big  the  computer, 
there  will  be  a  host  of  pressing  questions  we  seek  to  answer,  all  of  which  will  be 
compromised  if  we  must  expend  computing  resource  to  achieve  super-high  resolution. 

Is  there  an  alternative?  It’s  not  clear.  To  proceed  beyond  the  highly  idealized  statistical 
mechanical  calculations  requires  bold  leaps,  perhaps  along  lines  suggested  in  H92.  While 
bold  leaps  may  be  exhilarating,  are  they  scientific  and  do  they  practically  contribute?  [We 
should  remain  aware  that  eddy  viscosity  is  a  leap,  one  that  is  plain  wrong  but  only 
sanctioned  by  past  use.]  For  the  present  we  consider  the  leaps — and  their  consequences. 
Let  me  recall  only  briefly  discussion  from  H92  by  way  of  introducing  the  following  (this 
volume)  paper  by  Eby  and  Holloway  (hereafter  EH). 

We  begin  by  recognizing  that  oceans  are  subject  to  external  forcing  and  internal 
dissipation.  To  the  extent  that  forcing  is  on  relatively  large  scales  (atmospheric  synoptic 
scale  up  to  planetary  scale),  this  does  not  present  a  severe  problem.  Of  course  data 
uncertainty  is  always  an  issue.  Coastal  zone  forcing  may  be  a  largely  under-appreciated 
problem.  Internal  dissipation  is  parameterized  in  some  haphazard  fashion;  but  it  is  beyond 
the scope of this paper to address that. Thus we arrive at rudimentary ocean modelling:
calculating  the  (parameterized)  viscous  response  to  imposed  forcing. 

What  this  picture  leaves  out  are  tendencies  due  to  the  internal,  largely  “free”  dynamics  of 
the richly nonlinear, $10^{23}$-mode ocean. On account of forcing and dissipation, ocean
models  are  dragged  away  from  the  higher  entropy  state  to  which  the  internal  dynamics 
would  otherwise  tend.  As  the  ocean  model  is  drawn  away,  it  ought  to  feel  a  force  tending 
to restore toward higher entropy. Neither is this just a loose way of talking; the force the
ocean model should be feeling is very much like the tension in a stretched elastic (see the
entropy  calculus  in  Kubo  (1965,  ch.  1,  ex.  13)).  By  omitting  this  “entropy  force,”  ocean 
models  get  the  equations  of  motion  wrong. 

To include the entropy tendency in ocean models, we need three things:

1. ability to characterize the maximum entropy configuration of the ocean,

2. a measure of difference between the model solution and maximum entropy, and

3. a “spring constant” for the strength of tendency toward higher entropy.

Although  we  have  none  of  these  three  rigorously,  plausible  guesses  can  be  made.  Such 
guessing  may  annoy  more  serious-minded  colleagues.  However,  to  avoid  guessing  means 
falling  back  onto  eddy  viscosity — a  far  worse  guess.  I  make  the  following  guesses  in  part 
hoping  to  draw  the  attention  of  bright  talent  that  one  day  will  straighten  this  stuff  out. 

Unprejudiced  circulation 

We begin guessing from the barotropic, quasigeostrophic $\langle\psi\rangle = L^2 h$. There are two
immediate objections: it’s barotropic and quasigeostrophic. The barotropic aspect isn’t so
bad.  If  we  have  in  mind  large  scale  ocean  modelling  in  which  we  do  not  choose  to  resolve 
the first internal radius of deformation, then the theory of Salmon et al. (1976) shows that
statistical  equilibrium  is  nearly  barotropic  on  scales  larger  than  the  first  radius.  Of  course 
the  actual  ocean  is  not  barotropic;  but  we  readily  understand  that  in  terms  of  the  baroclinic 
projection  of  applied  forcing. 

Quasigeostrophy is more difficult, in part because the potential vorticity fluctuation
$h = f(H_0 - H)/H_0$ should involve only small departures of total depth $H$ from a reference
depth $H_0$. We mean to apply these ideas to primitive equation, full depth ocean models.
Quasigeostrophy is ambiguous as to whether $\psi$ refers to velocity streamfunction or to
depth-integrated transport streamfunction. H92 considers both, the former leading to
$\Psi^* = -f L^2 H^2 / H_0$ and the latter to $\Psi^* = -f L^2 H$, where $\Psi^*$ is introduced to denote
transport streamfunction at maximum entropy. These formulae were suggested with
constant $f$ in mind, appreciating that spatial scales of variation of $H$ will be small compared
with  planetary  radius.  Things  will  break  down  if  we  apply  such  formulae  carelessly,  say  to 
a  flat-bottomed  beta-plane  “ocean.”  For  the  present,  the  aim  is  to  proceed  most  simply 
with  realistic-geometry  practical  modelling  in  mind. 

Serious-minded colleagues may be appalled by the uncertainty over which formula to use
for $\Psi^*$. So be it. In practice, EH find little difference between the two as compared with
the larger differences from conventional modelling (the guess that $\Psi^* = 0$). Of the two
formulae above, the former requires assigning $H_0$ while the latter presents the possibility of
velocity singularities as $H \to 0$. The latter difficulty seems harmless because, first, models
approach $H \to 0$ discretely and, second, in shallower water direct forcing and dissipation
tend to over-ride the statistical mechanical tendency. Therefore, with short term
expedience in mind, we adopt an “unprejudiced circulation”

$$\Psi^* = -f L^2 H \qquad (3)$$

where “unprejudiced” refers to the “least biased” (minimum information) aspect of
maximum entropy. (A guess $\Psi^* = 0$ is, by comparison, one of extreme prejudice.) We are
left to assign $L^2$. In the ideal case of inviscid equilibrium, $L^2$ is a ratio of Lagrange
multipliers determined from $E$ and $Q$ and the number of retained degrees of freedom. In
practice, we have left $L$ a disposable fudge factor, presumably reflecting a length scale
somewhat shorter than actual eddy length scales. At absolute (ideal) equilibrium, one
expects $L^2$ to take a single value characterizing all of the domain, in just the sense that
temperature comes to be uniform for an isolated system. In practice, we understand that
the disparate regions of the forced, dissipative ocean are only weakly in “thermal”
(statistical mechanical) contact. Hence we expect that $L^2$ may have weak geographic
dependence,  tending  to  follow  eddy  length  scales.  A  natural  suggestion  is  to  tie  L  to 
regionally  smoothed  first  deformation  radius;  we  (EH)  have  not  yet  explored  this.  The 
shorter  term  expedient  is  simply  to  allow  that  L  should  be  somewhat  larger  at  lower 
latitudes.  In  EH  and  subsequent  experiments,  we’ve  let  L  range  from  a  few  km  in  the 
Arctic  to  a  few  tens  of  km  near  the  equator.  It  is  an  uncertainty  that  ought  to  be  narrowed 
in  future  work. 

When an actual ocean model is executed, its output will differ from (3). How to measure
that  difference  will  depend  upon  specifics  of  the  model  under  consideration.  As  a 
representative  model  (without  implication  concerning  its  putative  strengths  or 
weaknesses),  EH  have  considered  the  GFDL  “Modular  Ocean  Model,”  a  successor  to  the 
model  described  by  Cox  (1984).  Prognostic  variables  include  velocity,  temperature  and 
salinity  fields.  Because  the  velocity  field  is  split  to  external  (depth  integrated)  and  internal 
(baroclinic)  parts,  with  the  external  part  described  by  a  transport  streamfunction,  it  is 
natural to measure the difference between model streamfunction and $\Psi^*$. Continuing in
this simplest way, EH append a term in which the model relaxes toward $\Psi^*$ with a given
time  constant.  This  is  not  just  a  “quick  fix.”  The  appended  term  corresponds  to  the  tension 
in  a  stretched  elastic  as  mentioned  earlier.  It  reflects  missing  physics  in  conventional  model 
formulation. 

Temperature  and  salinity  equations  are  not  affected  by  the  entropy  tendency  here 
considered.  This  is  because  we  only  treat  scales  larger  than  the  first  radius  of  deformation. 
If  one  sought  to  apply  these  ideas  at  smaller  scales,  then  “entropy  forcing”  terms  should 
appear also in the temperature and salinity equations.

It remains to specify a time constant for the streamfunction restoring. With simplicity in
mind,  EH  choose  25  days  (while  also  exploring  sensitivity  to  other  choices).  These  first 
experiments  with  a  hybrid  model  are  discussed  in  a  following  article  (EH). 


Warning:  Reader  discretion  is  advised. 

The following article (Eby and Holloway) is rated R. This material has
been Rejected for responsible* publication. Not merely Rejected but
Rejected with vehemence as “... silly ... sketchy ... at best trivial ...
cavalier ... tenuous ... just another formula ...”

*  The  Rejection  committee  notes  with  concern  that  'Aha  Huliko'a 
follows  an  alarming  practice  of  irresponsibly  publishing  dangerous 
ideas. 


The  work  reported  by  EH  is  only  a  very  first  exploration,  trying  to  see  if  it  is  worth  the 
effort  to  further  pursue  this  line.  We  do  feel  encouraged  that,  first,  inclusion  of  entropy 
tendency  makes  a  difference  and,  second,  the  sense  of  the  difference  appears  to  be  toward 
improving  model  fidelity.  Can  we  do  better? 

There will need to be renewed theoretical effort with respect to matters like $L^2$ and how
the “spring constant” varies at different scales of motion. Some aspects for improvement
are clear, even if not precisely so. Relaxing streamfunction, as in EH, means that the largest
scales of motion tend to higher entropy as quickly as smaller scales, whereas both theory
and idealized experiments show that smaller scales should adjust more quickly. This
suggests filtering the relaxation process by some practically convenient operator such as
$\nabla^2$, which is already present in the momentum equation. Eddy viscosity! Our earlier
complaint about eddy viscosity is not so much with the differential operator as with the
aspect that conventional eddy viscosity drags models toward a state of rest (extreme
prejudice!). If instead we define from (3) a maximum entropy, barotropic (at scales larger
than first radius) flow


$$H\mathbf{u}^* = \mathbf{z} \times \nabla\Psi^* \qquad (4)$$

then we can center the eddy viscosity not about the state of rest but about $\mathbf{u}^*$. The eddy
viscosity term is given by $A\nabla^2(\mathbf{u} - \mathbf{u}^*)$, where $A$ might be a simple constant coefficient. Of
course one could also try to be more sophisticated about how $A$ may vary. For coarse
resolution modelling, temperature and salinity equations remain unaffected. (In fact,
horizontal eddy diffusion of density is consistent with the tendency toward barotropic $\mathbf{u}^*$.)

Figure 2. (a) Flow at 245 m after 150 years integration (“control case”) under annual mean windstress
and surface layer restoring to annual temperature and salinity. (b) Flow at 245 m when eddy viscosity is
modified to $A\nabla^2(\mathbf{u} - \mathbf{u}^*)$, with A = 2 x 10^ m^2/s and $L$ (a factor in $\mathbf{u}^*$) increasing from 3 km at the pole
to 15 km at the equator.


Figure 3. (a) Flow at 245 m for the “control case” (Fig. 2). Contours show the implied air-sea heat
exchange due to surface restoring to annual mean temperature. (b) Flow at 245 m and implied air-sea
heat exchange when eddy viscosity is modified (as Fig. 2).
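
As an illustration only, the recentered eddy viscosity term $A\nabla^2(\mathbf{u} - \mathbf{u}^*)$ discussed in the text can be sketched in Python on a uniform Cartesian grid. The five-point Laplacian, the grid spacing, the value of `A` and the sinusoidal `u_star` field are all assumptions made for this sketch; none of them is taken from the model runs shown in Figures 2 and 3.

```python
import numpy as np

def laplacian(field, dx, dy):
    """Five-point Laplacian at interior points; boundary rows and columns stay zero."""
    lap = np.zeros_like(field)
    lap[1:-1, 1:-1] = (
        (field[1:-1, 2:] - 2.0 * field[1:-1, 1:-1] + field[1:-1, :-2]) / dx**2
        + (field[2:, 1:-1] - 2.0 * field[1:-1, 1:-1] + field[:-2, 1:-1]) / dy**2
    )
    return lap

def centered_viscosity(u, u_star, A, dx, dy):
    """Eddy viscosity tendency centered about u_star instead of about rest."""
    return A * laplacian(u - u_star, dx, dy)

# Hypothetical values chosen only for illustration.
A = 1.0e4              # viscosity coefficient (m^2/s), assumed
dx = dy = 2.0e5        # grid spacing (m), assumed
ny, nx = 20, 30
x = np.arange(nx) * dx
y = np.arange(ny) * dy
X, Y = np.meshgrid(x, y)
u_star = 0.02 * np.sin(2.0 * np.pi * X / (nx * dx)) * np.cos(2.0 * np.pi * Y / (ny * dy))

conventional = A * laplacian(u_star, dx, dy)                # acts on u_star itself
recentered = centered_viscosity(u_star, u_star, A, dx, dy)  # acts on u - u_star
print(np.abs(conventional).max() > 0.0, np.abs(recentered).max() == 0.0)
```

When the flow equals $\mathbf{u}^*$ the recentered term vanishes, whereas the conventional term still acts to damp that same flow toward rest.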


Michael Eby has performed newer experiments replacing conventional eddy viscosity by
$A\nabla^2(\mathbf{u} - \mathbf{u}^*)$. While the experiments will be described fully in a subsequent paper (in
prep.), Michael has kindly provided me a few figures by way of preview. Results are
shown  after  150  years  under  steady  forcing  by  Hellerman-Rosenstein  (1983)  winds  and 
surface  layer  relaxation  of  temperature  and  salinity  toward  mean  Levitus  (1982).  Figure  2 
shows  flow  at  245  m  in  the  northeast  Pacific,  comparing  the  control  case  without  u* 
(panel  a)  and  the  test  case  (panel  b).  Differences  between  these  two  panels  are  evident.  As 
well, panel b differs from EH insofar as the eastern boundary poleward flow is more
narrowly confined over the upper continental slope (appreciating the coarseness of
resolution, here with roughly 1.9° grid spacing). Another difference between recent work
and  the  earlier  EH  is  a  clearer  upper  ocean  expression  of  the  entropy  tendency  which  is 
not  being  entirely  over-ridden  by  direct  forcing.  Deeper  in  the  water  column  (not  shown), 
the  entropy  tendency  dominates. 

Figure  3  shows  flow  at  245  m  in  the  northwest  Pacific.  Differences  are  apparent  between 
the  control  (panel  a)  and  test  (panel  b).  Inclusion  of  the  entropy  tendency  in  panel  b  results 
in  stronger  Oyashio  and  Sakhalin  Currents,  with  Kuroshio  confluence  at  lower  latitude. 
Note  especially  within  the  Sea  of  Japan  that  the  circulations  in  panels  a  and  b  are  of 
opposite  sign,  with  panel  b  supporting  a  warm  Tsushima  Current  along  the  west  coast  of 
Japan  (and  a  colder  southward  Korea  Current).  These  differences  of  circulation  (here 
shown  at  245  m)  are  so  large  that  they  affect  implied  air-sea  heat  and  freshwater 
exchanges. Contours on Figure 3 are implied heat exchange (W/m$^2$) given by the model
relaxation to Levitus atlas. Negative values denote oceanic heat loss. Differences between
panels a and b are significant, with the control case (panel a) implying annual mean heat
loss in excess of 300 W/m$^2$ with peak loss occurring north of 40°N while the test case
(panel b) shows peak heat loss reduced by about 100 W/m$^2$ and shifted to lower latitude.

A CONCLUSION

We  are  at  a  beginning,  not  a  conclusion.  I  think  we  can  say  that  the  equations  of  motion 
assumed  by  conventional  ocean  modelling  are  wrong.  They  are  wrong  because  they  do  not 
take  account  of  forces  which  should  arise  due  to  the  dependence  of  system  entropy  upon 
the  model  state.  What  we  are  just  beginning  is  the  attempt  to  improve  (complete)  the 
equations  of  motion.  Certain  calculations  can  be  made  with  care  for  both  equilibrium  and 
disequilibrium  statistical  mechanics.  Unhappily,  the  idealizations  required  to  effect  these 
calculations  prevent  direct  practical  application.  To  pass  beyond  this  point  with  only 
present  knowledge  requires  bold  leaps.  The  unsupported,  vague  nature  of  those  leaps  may 
annoy  more  serious-minded  colleagues.  Yet  I  think  the  leaps  can  accomplish  two  things 
(apart  from  sheer  fun):  first,  to  get  a  glimpse  of  what  may  lie  ahead,  and,  second,  to  attract 
interested  talent  that  may  strengthen  the  basis  for  surer  leaps  in  the  future. 

First  exercises  have  been  chosen  with  simplicity  in  mind.  We’ve  considered  large  scale, 
primitive  equation,  prognostic  ocean  modelling.  By  considering  only  resolution  coarse 
compared  with  first  internal  radius  of  deformation,  it  is  possible  to  define  a  maximum 
entropy (“unprejudiced”) circulation $\mathbf{u}^*$, which is barotropic at the coarse scale. [See Fig.
1 of EH.] Ambiguities arise when we seek to apply results from inviscid quasigeostrophy
to  a  model  based  upon  forced-dissipative,  full-depth,  primitive  equations.  Certain  choices 
(leaps)  were  made  that  will  surely  be  refined  in  future  work. 

Further exercises will look in two directions. We must look back, seeking a better footing
for the uncertain leaps. And we continue to look forward. To date we have explored
modification of a particular ocean model, chosen for its history and wide usage. There are
other prognostic ocean models, differently formulated in terms of variables or grids
(isopycnal layers, terrain-following coordinates, ...) and in terms of equation sets
(thermocline equations, semi-geostrophy, balance, ...). At finer resolution (coastal zone
studies, individual seamounts, ...) effects of stratification will need to be taken into account in
the maximum entropy solution. The combination of stratification and large amplitude
topography may be especially vexing. In coarse resolution exercises thus far, we’ve
avoided eddy-active models. As higher resolution models “admit” eddies (whether or not
adequately “resolving” them), some of the entropy tendency will be expected to appear
within the explicit model. How to complement such eddy-active models with entropy
tendency? (If one follows the $A\nabla^2(\mathbf{u} - \mathbf{u}^*)$ implementation, it is natural to reduce $A$ at
higher resolution. One might also substitute other operators such as $\nabla^4$.)

What  we  do  with  prognostic  models  suggests  modification  also  to  diagnostic  or  inverse 
models.  To  the  extent  that  an  inverse  model  may  be  constrained  by  equations  of  motion, 
“improved”  equations  for  prognostic  modelling  should  improve  the  quality  of  inverse 
solutions. As well, inverse models often minimize a cost function. There it is natural to
append a penalty for distance from maximum entropy, thereby seeking a “least prejudiced”
inverse solution. An important consideration is to include uncertain parameters ($L^2$, ...) as
parts of the inverse solution, evaluating them by best fit to data.

Finally,  we  can  choose  to  ignore  all  this  stuff.  When  computers  someday  get  to  be 
powerful  enough  (and  if  one  hasn’t  anything  else  interesting  to  compute),  then  we  can  just 
clobber everything by brute force.


ABSTRACT

A hybrid statistical mechanics / ocean circulation model is tested. A conventional
ocean model was revised to include a tendency for model streamfunction to relax toward a
maximum entropy configuration which depends on the shape of topography. The tendency
is called “topographic stress”. Comparisons are made between three cases: a control case
with streamfunction relaxation toward rest and two implementations of topographic stress
(differing by their functional dependence on total depth). The two topographic stress cases
perform similarly, but they differ from the control case in several regards. Topographic
stress strengthens equatorward tendencies in deep western boundary currents, sustains a deep
Alaska Stream, and leads to poleward eastern boundary undercurrents which are absent in
the control. In the upper water column, where direct wind and buoyancy forcing dominate,
the influence of topographic stress is slight.

INTRODUCTION 

Oceanic general circulation models (OGCMs) are used to advance our understanding of
physical processes in the ocean. Increasingly, OGCMs are being coupled to atmospheric models
and used to predict climate change. Most models, however, are not capable of simulating present
day ocean climatology accurately enough to provide a confident basis for predictions. We are
motivated to search for systematic defects which afflict these models. In particular, OGCMs
which are global in their domain and used for prediction over decades or longer are necessarily of
relatively coarse resolution. Oceanic eddies on length scales of tens of kilometers are either not
resolved,  or  are  only  marginally  resolved  in  ways  that  may  corrupt  their  dynamics.  It  is  therefore 
important  to  find  a  representation  of  unresolved  eddies  which  is  of  sufficient  skill  to  better 
recover  present  day  ocean  climatology,  providing  an  improved  basis  for  climate  change  studies. 

It  has  been  suggested  by  Holloway  (1992)  that  eddies  interact  with  bottom  topography 
to  generate  pressure-slope  correlations,  possibly  exerting  large  systematic  forces  (topographic 
stress) upon mean circulation. The usual eddy parameterizations in terms of bottom drag or eddy
viscosity move a model towards a state of rest, whereas topographic stress may be a driving
force  behind  mean  flows.  It  is  suggested  that  a  more  skillful  representation  of  unresolved  eddies 
may  be  given  by  the  tendency  toward  higher  system  entropy. 

Statistical  dynamical  tendencies  were  examined  by  Salmon  et  al.  (1976)  in  the  context 
of  ideal  quasi-geostrophic  dynamics.  Among  their  simplest  results  is  the  expectation  that, 
on scales larger than the first deformation radius, motion should tend to be barotropic and
given by a streamfunction satisfying

$$(\alpha/\beta - \nabla^2)\langle\psi\rangle = h \qquad (1)$$

where $\nabla^2$ is the 2-dimensional Laplacian, $\alpha/\beta$ is a ratio of Lagrange multipliers (due to
dynamics which conserve energy and enstrophy), $h = f\,\delta H/H$ is the potential vorticity due to
variation $\delta H$ about mean depth $H$, and $f$ is the Coriolis parameter. This equation implies that
an ocean with no external forcing, filled with random eddies (without mean motion), would
tend to set up a mean flow ($\langle\psi\rangle$) that depends on the topography ($h$).

In  reality,  the  ocean  has  external  forcing  and  internal  dissipation,  and  thus  is  not  a  closed 
system to which maximal entropy solutions apply. The state of actual ocean circulation is
achieved as a balance between entropy-increasing tendencies on account of eddy interactions and
entropy-limiting tendencies due to forcing and dissipation. OGCMs already have modest skill to
include large scale forcing, while internal dissipation is parameterized more haphazardly (in part
due to poorly understood eddies). What OGCMs omit is the eddy tendency toward increasing
entropy. We investigate the effects of modifying an OGCM such that the model would relax
not toward rest, but rather toward a solution such as (1). There is a theoretical leap in applying
a parameterization based on quasi-geostrophy to a primitive equation model. For this reason,
as well as uncertainty in how to characterize the competition between forcing-dissipation and
topographic stress, we do not know precisely how to proceed. What we do hope is that this study
will help motivate further theoretical work and demonstrate that the inclusion of a relatively
crude  parameterization  of  topographic  stress  already  improves  the  quality  of  model  simulations. 

IMPLEMENTATION 

The model chosen to study the effects of topographic stress was the GFDL Modular Ocean
Model (MOM) (Pacanowski et al. 1991) which is based on code originally formulated by
Bryan (1969) and further developed by Semtner (1974) and Cox (1984). Versions of this three
dimensional,  primitive  equation  model  are  widely  used  (Killworth  et  al.  1991). 

MOM calculates velocity as internal (baroclinic) and external (barotropic) modes. From
a vorticity tendency, the model solves an elliptic equation for transport streamfunction from
which it obtains the external mode of velocity. We will be using MOM at a grid resolution
more coarse than the first deformation radius, hence at scales for which the maximum
entropy solution is barotropic. Thus, we can introduce a simple relaxation of the model
streamfunction  toward  that  given  by  (1). 

There is a further simplification as well as certain ambiguities which arise in application
based upon (1). The ratio $\alpha/\beta$ is not well defined in reality, since its theoretical motivation
depends upon artifacts such as finite spectral truncation. However, $\alpha/\beta = 1/L^2$ defines a
length scale which is plausibly related to eddy length scales. In what follows, we treat $L$ as an
adjustable parameter on the order of 10 km. The model resolution we will use is much coarser
than $L$, so we may omit $\nabla^2$ in (1), taking only $\langle\psi\rangle = L^2 h$.
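
A quick numerical check of this simplification, offered as a sketch rather than as part of the model: on a doubly periodic domain, (1) can be inverted with a Fourier transform and compared with $L^2 h$. The grid size, the grid spacing and the sinusoidal $h$ are hypothetical; only $L$ of order 10 km comes from the text.

```python
import numpy as np

L = 10.0e3       # eddy length scale (m), "on the order of 10 km"
n = 64           # grid points per side, assumed
dx = 400.0e3     # grid spacing (m), assumed, much coarser than L
x = np.arange(n) * dx
X, Y = np.meshgrid(x, x)
domain = n * dx

# Hypothetical large-scale topographic potential vorticity h (1/s).
h = 1.0e-5 * (np.sin(2.0 * np.pi * X / domain) + 0.5 * np.cos(4.0 * np.pi * Y / domain))

# Wavenumbers of the periodic grid and the squared total wavenumber.
k = 2.0 * np.pi * np.fft.fftfreq(n, d=dx)
KX, KY = np.meshgrid(k, k)
k2 = KX**2 + KY**2

# Full solution of (1/L^2 - del^2) psi = h, and the approximation psi = L^2 h.
psi_full = np.real(np.fft.ifft2(np.fft.fft2(h) / (1.0 / L**2 + k2)))
psi_approx = L**2 * h

rel_err = np.abs(psi_full - psi_approx).max() / np.abs(psi_approx).max()
grid_scale_term = k2.max() * L**2   # size of the neglected term at the smallest resolved scale
print(rel_err, grid_scale_term)
```

With these assumed values the neglected term $k^2 L^2$ stays near one percent even at the grid scale, which is the sense in which $\nabla^2$ may be dropped when resolution is much coarser than $L$.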

Ambiguities arise also because (1) is based upon quasi-geostrophy whereas application will
be made in primitive equation MOM. The range of variation of depth, expressed by $h$, should
be small under quasi-geostrophy. In fact we will use the full range of oceanic depth, yielding
$\langle\psi\rangle = -fL^2 H/H_0$ where $H_0$ is a reference depth. Under quasi-geostrophy, interpretation of $\langle\psi\rangle$ is
arbitrary; it may describe either a transport or velocity streamfunction. If we adopt the velocity
streamfunction view, then $\langle\psi\rangle$ will be converted to a transport streamfunction for incorporation
into MOM. Because variation in $\Psi^*$ is dominated by variation in $H$, an approximation for the
maximum entropy transport streamfunction is given by

$$\Psi^* = -f L_v H^2 \quad \text{where} \quad L_v = \frac{\beta}{\alpha H_0} \qquad (2)$$

If we adopt the transport streamfunction view of (1), we multiply through by $H_0$, and the
maximum entropy transport streamfunction becomes

$$\Psi^* = -f L_t^2 H \quad \text{where} \quad L_t^2 = \frac{\beta}{\alpha} \qquad (3)$$

Ambiguities such as the different functional dependences in (2) or (3) may seem unnerving.
There should be no pretense to sophistication here. Simply, our aim is to use MOM to explore
sensitivity, comparing differences under (2) or (3) with the results from traditional subgridscale
relaxation to rest ($\Psi^*$ = constant). Length scales $L_v$ and $L_t$ are treated as adjustable, with $H_0$
absorbed into $L_v$. Moreover, one may consider that these length scales exhibit some weak
spatial dependence. In particular, we will consider that $L_v$ or $L_t$ vary with latitude. Clearly this
invites parameter tuning. At present, our aim is only to observe sensitivity to such issues.

Finally, the transport streamfunction ($\Psi$) calculated by MOM is replaced at each
time step with

$$\Psi + \frac{\Delta t}{\tau}(\Psi^* - \Psi) \qquad (4)$$

using either (2) or (3) for $\Psi^*$. The model velocity time step $\Delta t$ is of order 1 hour and $\tau$ is an
adjustable relaxation time of order 25 days.
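
The replacement step can be sketched in isolation, assuming it has the form $\Psi \leftarrow \Psi + (\Delta t/\tau)(\Psi^* - \Psi)$. In the sketch below the function name `relax_streamfunction` and the three target values are hypothetical, and the rest of the model dynamics (forcing, dissipation, the elliptic solve) is deliberately left out, so only the relaxation itself is shown.

```python
import numpy as np

def relax_streamfunction(psi, psi_star, dt, tau):
    """One relaxation step: psi <- psi + (dt / tau) * (psi_star - psi)."""
    return psi + (dt / tau) * (psi_star - psi)

dt = 3600.0                 # velocity time step (s), "of order 1 hour"
tau = 25.0 * 86400.0        # relaxation time (s), "of order 25 days"

psi_star = np.array([-2.0e6, 0.0, 3.0e6])   # hypothetical target values (m^3/s)
psi = np.zeros_like(psi_star)               # start from a state of rest

n_steps = int(round(tau / dt))              # number of steps in one relaxation time
for _ in range(n_steps):
    psi = relax_streamfunction(psi, psi_star, dt, tau)

# Fraction of the initial departure from psi_star that remains after time tau.
remaining = 1.0 - psi[0] / psi_star[0]
print(n_steps, remaining)
```

Because $\Delta t \ll \tau$, repeated application behaves like exponential relaxation: after one relaxation time the departure from $\Psi^*$ has decayed to roughly $1/e$ of its starting value.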

MODEL  SETUP 

A  coarse-resolution  global  model  was  created  with  grid  spacing  3.75°  in  longitude  and 
3.711°  in  latitude.  This  resolution  closely  approximates  the  spectral  T32  grid  used  by  the 
global atmospheric general circulation model of the Canadian Climate Centre. Fifteen levels
were used with layer thicknesses ranging from 20 to 870 m. Topography was extracted from
ETOPO5 (1986) using a raised cosine weighted average of the data within a grid cell. Four
islands were included: Madagascar, Australia, New Zealand and Antarctica.

The  model  was  forced  with  annual  mean  Hellerman  and  Rosenstein  (1983)  windstress  and 
a  50  day  relaxation  of  surface  salinity  and  temperature  to  annual  mean  Levitus  (1982).  The 
domain  was  limited  at  69°  North  to  avoid  high  grid  latitudes,  and  salinity  and  temperature 
were  relaxed  to  Levitus  values  on  the  artificial  northern  boundary  with  a  time  scale  of  3  years. 
Horizontal viscosity, horizontal diffusion, vertical viscosity and vertical diffusion were set to
2x10*  m^  s~‘,  4x10^  m^  s~*,  2xl0~*  m^  s~‘  and  1x10"^  m^  s“'  respectively.  Velocity 
time  steps  were  1  hour  and  the  tracer  time  step  was  2  days. 

In exploratory integrations, we have allowed relaxation time $\tau$ to vary from 10 to 200 days,
and length scale $L_t$ to vary from 10 to 30 km. Results overall were as expected: the
topographic stress parameterization caused larger model responses in cases with larger length
scale or shorter relaxation times. These integrations also demonstrated that without a latitude
dependence, topographic stress tendencies were relatively too strong at high latitudes. For longer
integrations, we have assigned a latitude dependence given by 1 - 0.9 sin(latitude). We have
chosen relaxation time $\tau$ to be 25 days and the equatorial value of length scale $L_t$ to be 22 km.
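
The chosen latitude dependence can be written out as a short sketch. Several points here are assumptions rather than statements from the text: that the factor 1 - 0.9 sin(latitude) multiplies the length scale itself (not its square), that the absolute value of latitude is used in the southern hemisphere, and that the transport form $\Psi^* = -f L_t^2 H$ is the one evaluated. The function names and the value of `OMEGA` are illustrative.

```python
import numpy as np

OMEGA = 7.292e-5   # Earth's rotation rate (rad/s)

def length_scale(lat_deg, L_eq=22.0e3):
    """Length scale (m): equatorial value times 1 - 0.9 sin|latitude| (assumed form)."""
    return L_eq * (1.0 - 0.9 * np.sin(np.radians(np.abs(lat_deg))))

def psi_star_transport(lat_deg, H, L_eq=22.0e3):
    """Maximum entropy transport streamfunction -f L^2 H (m^3/s), assumed form."""
    f = 2.0 * OMEGA * np.sin(np.radians(lat_deg))   # Coriolis parameter (1/s)
    return -f * length_scale(lat_deg, L_eq) ** 2 * H

for lat in (0.0, 30.0, 69.0):
    print(lat, length_scale(lat) / 1.0e3, psi_star_transport(lat, 4000.0))
```

Under these assumptions the length scale falls from 22 km at the equator to a few km at the northern limit of the domain, which is what weakens the topographic stress tendency at high latitudes.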

To compare the two implementations of topographic stress, relative values of the length
parameters $L_t$ and $L_v$ must be assigned. Equating the topographic stress solutions given by (2)
and (3), length parameters are related by $L_v H^2 = L_t^2 H$. Within the model, $H$ varies discretely
from 0 to 5.5 km; we equate both solutions at 5.5 km. Thus given a choice of 22 km for $L_t$, an
equivalent $L_v$ is 88 km. Although the extreme values of the two implementations are equivalent,
the barotropic velocities will be different. A topographic stress which is proportional to $L_t^2 H$, as
in (3), will tend to produce relatively stronger barotropic velocities in shallow water compared
to a topographic stress which is proportional to $L_v H^2$, as in (2). Maximum entropy (equilibrium)
velocities corresponding to (2) and (3) are shown in Figure 1.
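
The arithmetic behind the equivalent length parameter can be checked with a few lines. This sketch assumes the two forms are $\Psi^* = -f L_v H^2$ (velocity streamfunction view) and $\Psi^* = -f L_t^2 H$ (transport streamfunction view), that barotropic velocity is the gradient of $\Psi^*$ divided by $H$, and that $f$ is locally constant; the function name `speed_ratio` is hypothetical.

```python
L_t = 22.0      # length parameter of the transport form (km)
H_max = 5.5     # depth at which the two forms are equated (km)

# From L_v * H^2 = L_t^2 * H evaluated at H = H_max.
L_v = L_t**2 / H_max

def speed_ratio(H_km):
    """Barotropic speed of the L_t^2 H form divided by that of the L_v H^2 form.

    For the same f and bottom slope, |u| is proportional to L_t^2 / H for the
    transport form and to 2 * L_v for the velocity form (assumed relations).
    """
    return (L_t**2 / H_km) / (2.0 * L_v)

print(L_v)
for H_km in (5.5, 2.75, 1.0):
    print(H_km, speed_ratio(H_km))
```

Under these assumptions the ratio grows as the water shoals, so the form proportional to $L_t^2 H$ gives the relatively stronger barotropic velocities in shallow water.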

Integrations  were  carried  out  in  parallel  —  two  with  relaxation  to  the  equilibrium  solutions 
described  by  (2)  and  (3),  and  a  third  with  relaxation  to  zero  (control).  To  keep  the  runs  as 
similar  as  possible,  the  control  case  includes  relaxation  to  zero  streamfunction.  This  takes  into 
account  that  topographic  stress  velocities  are  small  compared  with  direct  forced,  upper  ocean 
flows.  Thus  topographic  stress  is  closer  to  relaxing  streamfunction  to  zero  than  to  no  relaxation. 
As  well,  when  including  stream  function  relaxation,  less  explicit  viscosity  is  required,  so  the 
model’s  total  transports  are  largely  unchanged.  With  respect  to  the  damping  processes,  we  try 
to  keep  the  control  case  as  similar  as  possible  to  the  topographic  stress  cases. 

Integrations  were  started  from  horizontally  averaged  Levitus  data.  Interior  relaxation  to 
Levitus  temperature  and  salinity  was  continued  for  10  years  with  a  relaxation  time  scale  of 
1 year. This method of start-up avoids shocking the model with observed data that is incompatible
with  the  model  physics  (Semtner  and  Chervin  1988).  The  model  was  then  released  from  any 
interior  relaxation  and  integrated  for  another  190  years.  Although  the  model  will  not  have 
reached  equilibrium  after  200  years,  comparisons  of  trends  can  be  made  between  parallel  runs. 

RESULTS 

From  Figure  1  we  can  anticipate  some  of  the  effects  of  topographic  stress.  Equilibrium 
velocities  are  poleward  on  the  eastern,  and  equatorward  on  the  western,  slopes  of  basins.  These 
velocities  suggest  currents  which  are  opposite  to  many  of  the  well  known  surface  currents  such 
as  the  Gulf  Stream,  Canary,  Brazil  and  Benguela  Currents  or  the  Kuroshio,  California,  East 
Australia  and  Peru  Currents.  Magnitudes  of  statistical  dynamical  velocities  are  small,  however, 
compared to the magnitude of the wind-driven surface currents, so we expect that topographic
stress  will  have  little  effect  on  the  surface  circulation.  At  greater  depths,  where  velocities  from 
directly  forced  flows  have  smaller  magnitude,  tendency  toward  higher  entropy  has  relatively 
greater  effect.  One  anticipates  the  development  or  strengthening  of  poleward  undercurrents 
along eastern boundaries and equatorward undercurrents along western boundaries.

Figure 1. Maximum entropy (equilibrium) velocities: (a) for the $L_t^2 H$ case and (b)
for the $L_v H^2$ case. [Scale arrow: 5 cm/s; horizontal axis: longitude.]

Velocities  for  the  first  few  levels  are  dominated  by  the  wind  and  buoyancy  driven  surface 
circulation.  Since  plots  from  the  three  integrations  are  visually  indistinguishable,  only  one 
plot at 35 m (level 2) is shown in Figure 2. Total transports are also very similar for all three
runs  since  much  of  the  transport  occurs  in  the  upper  ocean.  Small  differences  in  transport 
are  noticeable  along  continental  margins,  but  these  differences  are  more  clearly  seen  when 
comparing  velocity  fields.  Results  from  the  model  integrations  at  greater  depths  will  be 
discussed  for  three  geographic  areas:  the  North  Pacific,  the  Mid-Atlantic  and  the  Indian  Oceans. 

We will show results at two depth levels: 850 m (level 8) and 2750 m (level 12). Shallower
levels  are  dominated  by  direct  forcing.  With  the  choice  of  parameters  used  here,  the  competition 
between  direct  forcing  and  statistical  dynamics  is  such  that  the  statistical  dynamical  tendencies 
become  apparent  in  the  lower  main  thermocline,  roughly  850  m.  At  greater  depths,  statistical 
dynamical tendencies become more dominant.

Figure 2. Velocities at 35 m for the control case (results including topographic stress are
visually identical).


North  Pacific 

Velocities  at  850  m  for  the  three  implementations  are  shown  in  Figure  3.  Differences 
between the control run (Figure 3a) and the topographic stress runs (Figures 3b and 3c) can
be  seen  along  the  continental  margins.  The  control  run  exhibits  no  California  undercurrent 
and  only  a  weak  Oyashio.  The  control  also  has  a  strong  Kuroshio  extension  cutting  off 
a  weak  Alaska  Stream. 

Observational evidence for an undercurrent along the west coast of North America
is extensive, including work by Hickey (1979) off the coast of Southern Washington,
Freeland et al. (1984) off Vancouver Island and Chelton (1984) off California. Measurements
indicate poleward flow up to 15 cm s⁻¹, often with a width greater than 100 km, usually with a
maximum at depths less than 700 m, but extending to more than 1000 m. Most of these studies
were coastal in nature, thus the width and depth of the underflow have not been well established.
The temporal and spatial persistence of the undercurrent is also not well known.

Observations by Warren and Owens (1988) indicate a deep Alaska Stream flowing
westward, with mean velocities between 1 and 3 cm s⁻¹, along the northern side of the
Aleutian Trench. They also report evidence for a deep, eastward jet which flows parallel to
the  trench,  south  of  the  Alaska  Stream. 

Figure  4  shows  velocities  at  2750  m  for  the  three  integrations.  Coastal  currents  induced  or 
strengthened  by  topographic  stress  at  850  m  are  seen  more  clearly  at  2750  m,  with  a  western 
boundary  undercurrent  now  extending  to  the  equator.  Smaller  differences  between  the  control 
run  and  the  topographic  stress  runs  can  be  seen  in  the  central  Pacific. 

Direct observations of deep western boundary currents in the North Pacific are few. Indirect
inference from tracers such as silica (Talley and Joyce 1992) suggests northward deep flow along
the western boundary. Current meter observations by Fukasawa et al. (1986) in the Shikoku
Basin south of Japan (west of the region considered by Talley and Joyce 1992) show deep
mean currents of 5 to 10 cm s⁻¹ toward the south-west (parallel to local isobaths). Northward
flow of low silica water is contrary to that indicated by Figure 4, whereas the current meter
observations are consistent with topographic stress.


Figure 3. Velocities at 850 m in the North
Pacific Ocean: (a) for the control, (b) for the
L²H and (c) for the LH² cases.


Figure  4.  Velocities  at  2750  m  in  the  North 
Pacific  Ocean:  (a)  for  the  control,  (b)  for 
the L²H and (c) for the LH² cases.


The  deep,  narrow  trenches  found  in  the  Pacific  are  not  resolved  by  this  model,  but  could 
be  important  in  setting  up  counterflows  such  as  the  one  observed  by  Warren  and  Owens  (1988) 
south of the Alaska Stream. At small scale the baroclinic influence should be taken into
account; however, the barotropic formulations for topographic stress (formulae 2 or 3) suggest
a  tendency  for  opposing  currents  on  opposite  sides  of  a  trench.  One  could  imagine  cyclonic 
shear  above  the  trenches  in  mid-depth  and  abyssal  waters,  supporting  northward  and  eastward 
transport  of  tracers  in  the  western  Pacific  while  current  meters  on  the  inshore  side  of  trenches 
show  southward  or  westward  flow. 

Differences between the two topographic stress runs are subtle. The effects (when compared
to the control run) produced by the L²H implementation (Figures 3b and 4b) tend to be stronger
along the coast and slightly weaker in the central Pacific than with the LH² implementation
(Figures 3c and 4c). Since plots of the two implementations of topographic stress are so similar,
only plots from the control run and the L²H run will be shown for the other regions.

Mid-Atlantic 

Model velocities at 850 m are shown in Figure 5. Small differences between the control
run (Figure 5a) and the topographic stress run (Figure 5b) can be seen along the continental
margins. Topographic stress has weakened or reversed the control run's equatorward eastern
boundary  currents.  The  northward  flow  along  the  western  margin  is  also  weaker  in  the  North 
Atlantic  and  stronger  in  the  South  Atlantic  for  the  topographic  stress  run  than  for  the  control. 


Figure 5. Velocities at 850 m in the Mid-Atlantic Ocean: (a) for the control and
(b) for the L²H cases.

Figure 6. Velocities at 2750 m in the Mid-Atlantic Ocean: (a) for the control
and (b) for the L²H cases.

Differences between the control run and the topographic stress run are more obvious at
2750 m (Figure 6) than at 850 m. The control run does not develop any poleward eastern
boundary undercurrents. The deep, southward flowing western boundary currents are also
stronger in the North Atlantic, and weaker in the South Atlantic for the topographic stress
run  compared  to  the  control  run. 

Poleward undercurrents have been observed off the west coast of South Africa (Nelson 1989)
and off the coast of North Africa (Mittelstaedt 1989). Poleward flow has also been described
off the Iberian Peninsula (Barton 1989). Although spatial and temporal knowledge of poleward
undercurrents along the Eastern Atlantic is limited, direct measurements indicate a flow of
up to 10 cm s⁻¹ with a width of 30 to 100 km, often with a maximum at about 300 m,
but extending to great depths.

The Peru-Chilean undercurrent is also present in Figure 6b. Observations of the
Peru-Chilean  undercurrent  are  summarized  by  Fonseca  (1989).  This  current  has  been  seen 
from  the  surface  to  below  300  m. 

While  topographic  stress  may  be  a  dominant  force  in  the  deep  western  boundary  currents 
of the North Pacific, thermohaline forcing is clearly important in the Atlantic. Because the
model domain is truncated at 69° North, the thermohaline forcing has in part been provided
by relaxation toward mean Levitus at all depths along the artificial northern boundary. To
test the sensitivity of the model to this northern boundary condition, two further integrations
were performed, for the control case and for the L²H case, without interior relaxation on the
boundary. Velocities at 2750 m are shown in Figure 7. Without relaxation, the western boundary
current is largely absent in the control case (Figure 7a), but remains present in the topographic
stress  case  (Figure  7b). 


Figure  7.  Velocities  at  2750  m  in  the  Mid-Atlantic  Ocean  without  relaxation  to  Levitus  along 
the northern boundary: (a) for the control and (b) for the L²H cases.


It is clear that when the nature of imposed forcing happens to agree with a maximum
entropy  configuration,  topographic  stress  may  not  play  a  very  significant  role.  The  stress 
depends  upon  how  far  forced-dissipative  flows  are  from  ideal  maximum  entropy.  A  second 
observation  concerns  the  climatic  implications  of  these  results.  One  may  speculate  that  variation 
in thermohaline forcing of the North Atlantic could lead to abrupt alteration of the pattern of
deep circulation. Comparison of Figures 6a and 7a indeed appears to support this speculation.
However, when topographic stress is included, Figures 6b and 7b show a deep circulation
which is rather insensitive to changes in thermohaline forcing. The indication is that inclusion
or omission of topographic stress in coupled ocean-atmosphere climate models could have
significant effect on the overall sensitivity of the coupled system.

Indian

Model  velocities  at  850  m  are  shown  in  Figure  8.  Effects  of  topographic  stress  include  a 
slight  weakening  of  the  West  Australian  and  the  Agulhas  Currents,  and  a  slight  strengthening 
of  both  the  undercurrent  off  the  east  coast  of  Australia  and  the  cyclonic  circulations  in  the 
Northern  Indian  Ocean. 


Figure  8.  Velocities  at  850  m  in  the  Indian 
Ocean:  (a)  for  the  control  and  (b)  for  the 
L²H cases.


Figure  9.  Velocities  at  2750  m  in  the 
Indian  Ocean:  (a)  for  the  control  and  (b) 
for the L²H cases.


Figure  9  shows  velocities  for  the  two  integrations  at  2750  m.  Differences  between  the  two 
runs are more pronounced than at 850 m. Topographic stress induced effects include: an increase
in the transport of deep Antarctic water northward into the equatorial Indian; a strengthened
undercurrent along the south, west and east coasts of Australia; a strengthening of the circulation
in the North Indian; and a slight weakening of the Circumpolar Current near the coast of
Antarctica leading to the development of a countercurrent, west of the Kerguelen Plateau.

An  undercurrent  beneath  the  Leeuwin  current  has  been  observed  to  be  equatorward 
(Church et al. 1989), contrary to the topographic stress tendency. Model results off the
west coast of Australia demonstrate the competition between direct forcing and topographic
stress. Figure 2 suggests a weak, poleward current which continues down to about 200 m,
after which an equatorward undercurrent is present until about 1000 m (Figure 8). Although
topographic stress opposes an equatorward undercurrent, other forces are overriding the
maximum entropy tendency. Between 1000 and 2000 m, both integrations generally show
poleward flow. Differences between the two runs are most apparent between 2000 and 3500 m
where slow, mixed flow is seen for the control run while a steady poleward flow is seen for
the topographic stress run (Figure 9). Below 3500 m the flow is again equatorward for both
integrations,  although  stronger  for  the  control  run. 

SUMMARY 

The  GFDL  Modular  Ocean  Model  was  used  to  explore  proposed  representations  of 
topographic  stress  for  large  scale  ocean  modelling.  We  have  sought  to  characterize  the  effect 
of  unresolved  eddy-topography  interaction  (topographic  stress)  in  terms  of  a  tendency  for  large 
scale  flow  to  evolve  toward  a  state  of  higher  system  entropy. 

Two  implementations  of  topographic  stress  were  tested,  differing  in  the  assumed  functional 
dependence  of  streamfunction  upon  depth.  The  two  topographic  stress  cases  perform  similarly, 
but  they  differ  from  the  control  case  in  several  respects.  The  most  apparent  effects  of 
topographic  stress  are  seen  along  continental  boundaries,  particularly  in  the  development  or 
strengthening  of  undercurrents.  Integrations  which  include  topographic  stress  produce  many  of 
the  observed  poleward  eastern  boundary  undercurrents  which  the  control  run  does  not.  Along 
western  boundaries,  the  topographic  stress  tendency  is  equatorward.  Differences  from  the 
control run are seen more clearly in the western Pacific than in the western Atlantic, since the
Atlantic is also responding to stronger thermohaline forcing.

Physical oceanographers deal with randomness and uncertainties when analyzing ocean
data or formulating ocean models. Concepts and results from probability theory, statistical
inference, and stochastic processes are applied; space-time averages are interpreted as
ensemble  averages,  variances  and  spectra  estimated,  and  stochastic  terms  added  to 
dynamical  equations.  Special  aspects  arise  when  the  huge  amount  of  real  or  model  ocean 
data and the complexities of ocean physics are considered. Efficient data representation
and  analysis  algorithms  are  sought,  and  idealized  dynamics  that  incorporate  statistical  and 
chaotic  tendencies  are  explored.  Progress  on  such  statistical  methods  was  discussed  at  the 
seventh  ‘Aha  Huliko‘a  Hawaiian  Winter  Workshop,  held  January  12-15,  1993  in 
Honolulu.  Specifically,  the  participants  considered  the  variety  of  oceanographic 
observations, methods for efficient flow and data representation, frequentist versus
Bayesian inference, data assimilation, and idealized dynamics. The size and complexity of
oceanographic problems often prevent the application of standard methods and physical
oceanographers  are  faced  with  the  task  of  inventing  methods  that  deal  with  the 
peculiarities  of  their  problems  in  a  sensible  manner.  These  special  methods  are  discussed  in 
more  detail  in  this  article.  Names  in  parentheses  refer  to  the  authors  of  lectures  given  at 
the  meeting  and  chapters  published  in  the  proceedings. 

OCEANIC  OBSERVATIONS 

Oceanographic  phenomena  cover  many  space  and  time  scales  and  the  questions  asked, 
probability  assumptions  made,  and  inferences  drawn  differ  widely.  A  few  examples  were 
considered. 

The  smallest  scales  of  variability  are  important  for  mixing.  One  of  the  oldest  unsolved 
problems  is  how  the  strength  of  ocean  mixing  varies  with  location  and  over  time. 
Traditional  measurement  methods  have  depended  upon  making  estimates  of  dissipation 
rates from direct observations at scales of centimeters. Fluxes are inferred from
dissipation, after further uncertain assumptions. As instruments for direct observation of
the fluxes become available, the relation of flux to dissipation should become clearer.
However,  the  difficulty  remains  that  direct  observation  at  centimeter  scales  is  a 
time-consuming  (hence  expensive)  operation,  limiting  its  applicability  for  larger  scale 
ocean  surveying.  A  possibility  is  that  larger  scale  observation  of  vertical  velocity  by  a 
modified  acoustic  Doppler  current  profiler  may  provide  a  basis  for  estimation  of 
dissipation  and  mixing  while  enabling  rapid  surveying  (A.  Gargett).  The  value  of  this 
proposed  method  will  depend  upon  extensive  intercomparison  with  more  traditional 
measurements,  using  statistics  to  calibrate,  and  to  estimate  the  confidence  of,  inferences 
from  acoustic  Doppler  surveys. 

Over  vertical  scales  from  meters  to  tens  of  meters,  it  is  natural  to  describe  the  variability  in 
the  ocean  interior  in  terms  of  vertical  displacements  of  isopycnals.  The  frequency  of 
chosen  isopycnals  in  a  given  vertical  bin  size  may  be  fitted  against  a  Poisson  distribution 
while  the  separation  between  isopycnals  may  approximate  a  gamma  distribution, 
constrained to have unit mean (R. Pinkel). The mean, variance, and skewness at many
scales are thus described by a single (dimensional) parameter, whose physical significance
remains  elusive. 

On  larger  scales,  underway  acoustic  Doppler  profiling  together  with  accurate  global 
positioning  has  made  possible  the  surveying  of  upper  ocean  currents  along  ship  tracks  (E. 
Firing).  Length  scales  of  velocity  variations,  both  in  the  horizontal  and  vertical,  are 
observed  to  change  with  latitude  and  depth,  including  strong  signatures  when  crossing  the 
equator.  No  comprehensive  statistical  descriptions  of  the  variability  and  its  changes  have 
been  established. 

Satellite  altimeter  data  have  allowed  the  mapping  of  sea  level  variance  (D.  Chelton).  A 
close  relation  between  the  variance  of  transient  eddies  and  the  intensity  of  mean  flows  and 
the  bathymetry  is  found.  Wavenumber  spectra  show  a  break  of  slope  near  the  first  internal 
Rossby radius, then descend steeply towards higher wavenumbers. At the
crossovers  of  ascending  and  descending  satellite  ground  tracks,  the  two  components  of 
surface  geostrophic  velocity  can  be  determined.  This  has  been  used  to  investigate  the 
anisotropy  of  velocity  variance  and  lateral  transfers  of  momentum  by  Reynolds  stresses  on 
a  dense  global  grid.  Further  inferences  are  complicated  by  the  unique  space-time  sampling 
characteristics  of  satellite  observations.  The  choice  of  orbit  parameters  for  satellite 
missions  can  be  discussed  in  terms  of  these  sampling  characteristics  and  their  implied 
filters,  transfer  functions,  and  biases. 

Interest  in  oceanography  is  often  focussed  on  the  larger  scales,  and  considerable  ingenuity 
goes  into  averaging  local  observations  to  remove  the  “noise”  and  obtain  the  “signal.” 
Noise  can  be  suppressed  by  making  observations  that  are  inherently  of  an  integrating  type, 
such  as  acoustic  thermometry,  reciprocal  tomography,  electric  field  and  bottom  pressure, 
inverted echo soundings, cable voltages, polar motion, and length of day. Such integrating
measurements often provide cleaner signatures of underlying dynamical processes. For
example, bottom pressure and electric voltage are excellent proxies for the barotropic flow
component and are found to be well correlated with the atmospheric windstress curl,
suggesting that the subinertial barotropic variability in the ocean is atmospherically forced
(D.  Lu...er). 

FLOW  REPRESENTATION 

The  proper  choice  of  basis  functions  on  which  to  decompose  a  flow  field  becomes  crucial 
when  one  seeks  to  reduce  the  system  space.  Reduced  bases  should  optimally  resolve  the 
underlying  dynamics,  representing  the  more  significant  flow  patterns.  Wavelet  transforms 
might  be  such  an  optimal  choice.  Traditional  analyses  of  turbulence  have  resorted  either  to 
Fourier  spectra  or  to  grid  point  (Dirac)  treatment,  suggesting  superposition  of  waves  or 
interaction  of  isolated  vortices.  These  are  extreme  views,  emphasizing  either  perfect 
wavenumber  resolution  with  no  spatial  resolution  or  the  converse.  Wavelets  make  a 
compromise,  offering  limited  wavenumber  resolution  with  limited  spatial  resolution, 
suggesting  ‘coherent  structures.’  Applied  to  analyses  from  numerical  two-dimensional 
turbulence  (M.  Farge),  wavelets  were  shown  to  be  efficient  at  retaining  the  full  range  of 
spectral  information  while  permitting  spatially  local  examination  of  regions  dominated  by 
rotation  compared  with  regions  dominated  by  straining. 
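
The trade-off between wavenumber and spatial resolution can be illustrated with a minimal sketch. It uses the Haar wavelet, which is chosen here only for brevity and is not claimed to be the wavelet used in the work cited above; the 16-point spike signal and all function names are hypothetical. An isolated feature spreads over every Fourier coefficient but touches only a few wavelet coefficients.

```python
import cmath
import math

def haar_forward(signal):
    """Haar wavelet decomposition of a signal whose length is a power of two."""
    approx = list(signal)
    details = []  # detail coefficients, finest scale first
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        details.append([(a - b) / math.sqrt(2.0) for a, b in pairs])
        approx = [(a + b) / math.sqrt(2.0) for a, b in pairs]
    return approx[0], details

def haar_inverse(mean_coeff, details):
    """Rebuild the signal from the coefficients returned by haar_forward."""
    approx = [mean_coeff]
    for level in reversed(details):
        rebuilt = []
        for a, d in zip(approx, level):
            rebuilt.extend([(a + d) / math.sqrt(2.0), (a - d) / math.sqrt(2.0)])
        approx = rebuilt
    return approx

def dft_magnitudes(signal):
    """Magnitudes of the discrete Fourier coefficients (direct O(n^2) sum)."""
    n = len(signal)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(signal))) for k in range(n)]

# Hypothetical test signal: an isolated spike standing in for a localized feature.
signal = [0.0] * 16
signal[5] = 1.0

mean_coeff, details = haar_forward(signal)
n_wavelet = sum(1 for level in details for d in level if d != 0.0) + (mean_coeff != 0.0)
n_fourier = sum(1 for m in dft_magnitudes(signal) if m > 1e-9)
print("nonzero Haar coefficients:", n_wavelet, "of", len(signal))
print("nonzero Fourier coefficients:", n_fourier, "of", len(signal))
```

The nonzero Haar coefficients are those whose support contains the spike, one per scale plus the mean, so their positions locate the feature; the Fourier magnitudes carry no such positional information.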

Another  generalization  of  the  traditional  Fourier  transform  is  based  on  the  solutions  of 
exactly  integrable  nonlinear  wave  equations.  The  method,  known  as  inverse  scattering 
transform,  decomposes  a  time  series  into  a  superposition  of  nonlinear  oscillation  modes, 
which  include  ordinary  sine  waves,  cnoidal  waves,  solitary  waves,  and  other  special  wave 
forms (A. Osborne). Applications of the method, based on the solutions of the
Korteweg-de Vries equation, have been efficient and insightful in the analysis of field and
lab data.

Bases  for  representing  actual  data  are  often  chosen  in  an  empirical  way  by  methods  known 
as  ‘empirical  orthogonal  functions’  (EOFs),  ‘principal  component  analyses’  (PCAs),  or 
‘factor analyses.’ Eigenvalues from the data covariance matrix permit ranking the
empirical  eigenvectors  according  to  what  fraction  of  total  data  variance  is  represented  by 
each  eigenvector.  This  allows  retention  of  a  relatively  few  eigenvectors  in  lieu  of  the  much 
larger dataset. Where data are seen to be “clumped,” EOFs are rotated to more nearly
recognize the “clumps.” A major open question is the identification of the significant
EOFs. Selection rules that claim to perform a significance test may lack a thorough
statistical basis, and the often applied rules-of-thumb are just what they claim to be:
rules-of-thumb. A promising approach is the testing against artificial data (G. Mitchum).
Data  generated  by  a  red  noise  process  are  particularly  relevant  to  oceanographic  fields 
such  as  sea  level. 
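
Testing against artificial data can be sketched as a small Monte Carlo experiment. The sketch below is an illustration under stated assumptions, not the procedure presented at the workshop: the null hypothesis is taken to be spatially independent first-order autoregressive (red) noise with a known lag-one autocorrelation `phi`, the "observations" are synthetic, and all names and parameter values are hypothetical. The variance fraction of the leading EOF is compared with the upper tail of the same statistic computed from red-noise surrogates.

```python
import math
import random

def covariance(data):
    """Covariance matrix of `data`, a list of time samples (each a list of values)."""
    n, m = len(data), len(data[0])
    mean = [sum(row[j] for row in data) / n for j in range(m)]
    cov = [[0.0] * m for _ in range(m)]
    for row in data:
        dev = [row[j] - mean[j] for j in range(m)]
        for i in range(m):
            for j in range(m):
                cov[i][j] += dev[i] * dev[j] / (n - 1)
    return cov

def leading_fraction(cov, iterations=300):
    """Fraction of total variance carried by the first EOF (power iteration)."""
    m = len(cov)
    vec = [1.0 / math.sqrt(m)] * m
    eigenvalue = 0.0
    for _ in range(iterations):
        prod = [sum(cov[i][j] * vec[j] for j in range(m)) for i in range(m)]
        eigenvalue = math.sqrt(sum(p * p for p in prod))
        vec = [p / eigenvalue for p in prod]
    return eigenvalue / sum(cov[i][i] for i in range(m))

def red_noise(n, m, phi, rng):
    """n samples of m independent AR(1) series with lag-one autocorrelation phi."""
    x = [rng.gauss(0.0, 1.0) for _ in range(m)]
    scale = math.sqrt(1.0 - phi * phi)  # keeps each series at unit variance
    out = []
    for _ in range(n):
        x = [phi * xi + scale * rng.gauss(0.0, 1.0) for xi in x]
        out.append(x)
    return out

rng = random.Random(1)
n_time, n_space, phi = 200, 5, 0.8  # hypothetical record length and redness

# Synthetic "observations": red noise plus one shared oscillating pattern.
pattern = [1.0, 0.8, 0.5, -0.5, -1.0]
data = red_noise(n_time, n_space, phi, rng)
for t, row in enumerate(data):
    amplitude = 3.0 * math.sin(2.0 * math.pi * t / 40.0)
    for j in range(n_space):
        row[j] += amplitude * pattern[j]

observed = leading_fraction(covariance(data))
null = sorted(leading_fraction(covariance(red_noise(n_time, n_space, phi, rng)))
              for _ in range(100))
threshold = null[94]  # roughly the 95th percentile of the null statistic
print("leading EOF fraction:", observed, "red-noise threshold:", threshold)
```

In practice the redness and spatial structure of the null process would have to be estimated from the data, which is where much of the difficulty lies.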

Time  series  from  coefficients  of  EOFs  may  be  subject  to  further  analyses.  By  best  fitting 
such time series to a first order vector Markov process, one identifies linear combinations
of EOFs that exhibit oscillation or propagation, termed ‘principal oscillation patterns’
(POPs)  (H.  von  Storch).  A  POP  analysis  can  be  performed  both  on  actual  data  and  on  the 
output  from  large  numerical  models.  It  can  be  a  tool  in  identifying  linear  subsystems  when 
these  linear  subsystems  control  a  significant  portion  of  the  variability. 
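
The fitting step can be sketched for the smallest nontrivial case, two EOF coefficients. The sketch assumes a zero-mean series and estimates the matrix A of the first-order vector Markov process x(t+1) = A x(t) + noise by least squares; a complex eigenvalue of A then gives an oscillation period and a decay time. The synthetic damped rotation, the noise level, and all names are hypothetical, and a full analysis would also examine the eigenvectors, which are the patterns themselves.

```python
import cmath
import math
import random

def fit_markov_2d(series):
    """Least-squares A in x(t+1) = A x(t) + noise for a 2-component series."""
    c0 = [[0.0, 0.0], [0.0, 0.0]]  # lag-0 moments, sum of x(t) x(t)^T
    c1 = [[0.0, 0.0], [0.0, 0.0]]  # lag-1 moments, sum of x(t+1) x(t)^T
    for a, b in zip(series[:-1], series[1:]):
        for i in range(2):
            for j in range(2):
                c0[i][j] += a[i] * a[j]
                c1[i][j] += b[i] * a[j]
    det = c0[0][0] * c0[1][1] - c0[0][1] * c0[1][0]
    inv = [[c0[1][1] / det, -c0[0][1] / det],
           [-c0[1][0] / det, c0[0][0] / det]]
    # A = C1 C0^{-1}
    return [[sum(c1[i][k] * inv[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def eigenvalues_2x2(a):
    """Both eigenvalues of a real 2x2 matrix, returned as complex numbers."""
    tr = a[0][0] + a[1][1]
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    root = cmath.sqrt(tr * tr / 4.0 - det)
    return tr / 2.0 + root, tr / 2.0 - root

# Synthetic coefficients: a damped rotation (period 20 steps) driven by noise.
rng = random.Random(0)
rho, period = 0.95, 20.0
theta = 2.0 * math.pi / period
x = [1.0, 0.0]
series = []
for _ in range(2000):
    x = [rho * (math.cos(theta) * x[0] - math.sin(theta) * x[1]) + rng.gauss(0.0, 0.1),
         rho * (math.sin(theta) * x[0] + math.cos(theta) * x[1]) + rng.gauss(0.0, 0.1)]
    series.append(x)

lam = eigenvalues_2x2(fit_markov_2d(series))[0]
est_period = 2.0 * math.pi / abs(cmath.phase(lam))  # oscillation period, in steps
est_efold = -1.0 / math.log(abs(lam))               # e-folding decay time, in steps
print("estimated period:", est_period, "estimated decay time:", est_efold)
```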

FREQUENTIST  VERSUS  BAYESIAN  INFERENCE 

Statistical  inferences  depend  on  the  probability  assumptions  made.  One  interpretation  of 
probability  is  from  a  ‘frequentist’  viewpoint.  If  an  experiment  can  be  regarded  as  being 
repeated  many  times,  previous  outcomes  may  be  collected  to  estimate  the  probability  for 
subsequent  outcomes.  In  contrast,  a  Bayesian  approach  may  be  taken  in  a  case  where  only 
one  experiment  is  possible.  Then  one  expresses  one’s  prior  beliefs  concerning  uncertain 
model  parameters  as  a  probability  distribution,  observes  data,  then  computes  a  posterior 
distribution about those parameters. The frequentist’s method assumes that the process
and  all  its  parameter  values  remain  constant  over  the  (often  imagined)  series  of 
experiments.  The  Bayesian  approach  has  to  assume  a  prior  distribution.  Both 
methodologies  have  their  advantages,  and  the  appropriate  statistical  approach  depends  on 
the  type  of  problem  considered  and  the  type  of  inference  desired  (G.  Casella). 
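
The prior-to-posterior step can be made concrete with the simplest conjugate case, sketched here under assumptions that are not from the workshop: a normal prior on an unknown mean, normally distributed observation errors of known variance, and hypothetical numbers throughout.

```python
def normal_posterior(prior_mean, prior_var, obs, obs_var):
    """Posterior mean and variance of a normal mean with known observation variance."""
    n = len(obs)
    post_precision = 1.0 / prior_var + n / obs_var
    post_mean = (prior_mean / prior_var + sum(obs) / obs_var) / post_precision
    return post_mean, 1.0 / post_precision

obs = [12.1, 11.8, 12.4]           # hypothetical measurements
sample_mean = sum(obs) / len(obs)  # the usual frequentist point estimate
post_mean, post_var = normal_posterior(prior_mean=10.0, prior_var=4.0,
                                       obs=obs, obs_var=1.0)
print("sample mean:", sample_mean)
print("posterior mean:", post_mean, "posterior variance:", post_var)
```

The posterior mean is a precision-weighted compromise between the prior mean and the sample mean, and the posterior variance is smaller than either the prior variance or the sampling variance of the mean. As more data arrive the prior's weight shrinks and the two estimates converge.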

The  Bayesian  approach  emphasizes  a  clear  statement  of  prior  beliefs.  This  may  be  valuable 
in  bringing  out  ‘hidden’  assumptions  (J.  Kadane).  How  one  formulates  a  null  hypothesis 
can  be  crucial.  If  one  formulates  the  null  hypothesis  as  a  ‘sharp’  statement,  then  collecting 
sufficient  data  will  lead  to  rejection,  unless  the  hypothesis  is  exactly  true.  Another  danger 
is  that  formulating  a  null  hypothesis  after  observing  some  data,  for  example  in  the  case  of 
Earth’s  climate  record,  may  confuse  the  testing  of  such  a  hypothesis  (H.  von  Storch). 
Bayesian  methods  have  successfully  been  applied  to  the  quality  control  of  data  used  for 
assimilation in numerical weather prediction models where the prior knowledge about
error distribution and background fields needs to be properly weighted (A. Lorenc).
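
The remark about ‘sharp’ null hypotheses can be checked with a few lines of arithmetic. The sketch assumes a z-test of the hypothesis "mean equals zero" with known noise level; the departure `delta` and the other numbers are hypothetical. Because the expected test statistic grows as the square root of the sample size, any nonzero departure is eventually rejected.

```python
import math

def expected_z(delta, sigma, n):
    """Expected z statistic for testing 'mean == 0' when the true mean is delta."""
    return delta * math.sqrt(n) / sigma

def samples_needed(delta, sigma, z_crit=1.96):
    """Approximate sample size at which the expected z statistic reaches z_crit."""
    return math.ceil((z_crit * sigma / delta) ** 2)

delta, sigma = 0.01, 1.0  # hypothetical: true mean is 1% of the noise level
for n in (100, 10_000, 1_000_000):
    print(n, expected_z(delta, sigma, n))
n_reject = samples_needed(delta, sigma)
print("expected rejection at the 5% level once n reaches about", n_reject)
```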

DATA  ASSIMILATION 

Combining  data  with  models  serves  various  purposes.  The  model  may  help  to  complete  a 
data  set,  ‘dynamically  interpolating’  to  fill  data  gaps.  An  example  is  the  reconstruction  of 
Gulf  Stream  paths  (M.  Chin). 

One  of  the  ways  that  oceanography  may  differ  from  weather  forecasting  is  that  there  is 
less  emphasis  upon  ocean  ‘forecasting.’  To  a  considerable  extent,  the  role  of  data 
assimilation  in  the  ocean  may  be  more  as  a  means  to  obtain  information  about  uncertain 
parameters  in  the  ocean  physics.  Among  the  more  rigorous  methods,  oceanographers 
employ  adjoint  methods,  integrating  backward  in  time  to  obtain  the  gradient  of  the  cost 
function.  An  example  is  the  estimation  of  vertical  eddy  viscosity  and  surface  drag 
coefficient  from  data  in  a  wind  forced  Ekman  layer  (J.  O’Brien).  For  more  complex 
systems, simpler algorithms are being developed that seek to reduce a chosen cost
function toward a smaller value which nonetheless might not be a minimum. This can be
seen  in  a  time-dependent,  three-dimensional  circulation  model  of  the  North  Atlantic, 
including  an  embedded  mixed  layer  with  uncertain  parameters  (J.  Schroter).  Repeated 
forward runs of the model are made using different choices of parameters, with each outcome
evaluated by its cost function.
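
The backward integration used by adjoint methods can be sketched on a toy problem far simpler than the Ekman layer example above. The sketch assumes a scalar model x(k+1) = a x(k) with one uncertain parameter a and a least-squares cost against observations; the model, the synthetic observations, and the step size are all hypothetical. One forward run and one backward sweep give the exact gradient of the cost with respect to a, which then drives a plain gradient descent.

```python
def forward(a, x0, n):
    """Integrate the toy model x(k+1) = a * x(k) for n steps."""
    xs = [x0]
    for _ in range(n):
        xs.append(a * xs[-1])
    return xs

def cost(a, x0, obs):
    """Half the summed squared misfit between model states 1..n and observations."""
    xs = forward(a, x0, len(obs))
    return 0.5 * sum((x - d) ** 2 for x, d in zip(xs[1:], obs))

def adjoint_gradient(a, x0, obs):
    """dJ/da from one forward run followed by one backward (adjoint) sweep."""
    n = len(obs)
    xs = forward(a, x0, n)
    lam = 0.0   # adjoint variable: sensitivity of the cost to the state x(k)
    grad = 0.0
    for k in range(n, 0, -1):  # integrate backward in time
        lam = (xs[k] - obs[k - 1]) + a * lam
        grad += lam * xs[k - 1]
    return grad

# Synthetic observations generated with a "true" parameter of 0.9.
x0, a_true = 1.0, 0.9
obs = forward(a_true, x0, 20)[1:]

a_est = 0.5  # first guess
for _ in range(2000):
    a_est -= 0.002 * adjoint_gradient(a_est, x0, obs)  # fixed-step descent
print("estimated parameter:", a_est)
```

The cost of the gradient is one extra sweep regardless of the number of parameters, which is the main attraction of the adjoint approach for large models.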

Model errors, such as those caused by imprecisely known physics or forcing fields, often
represent  a  major  uncertainty  in  the  problem  and  need  to  be  accounted  for  in  the  cost 
function.  This  is  the  case  for  the  heat  flux  uncertainties  in  a  model  of  the  tropical  sea 
surface  temperature  (C.  Frankignoul).  Since  the  heat  flux  uncertainties  have  poorly  known 
correlation  scales,  an  adaptive  method  is  applied  where  the  model  being  tuned  is  also  used 
to  determine  the  uncertainties  in  the  heat  flux  field.  Parameters  are  found  that  reduce  the 
warm  sea  surface  temperature  bias  and  that  might  eliminate  the  need  for  a  flux-correction 
term  in  climate  studies. 

There are various methods to seek extrema of a function. The adjoint equation technique
yields estimates of the gradient of the cost function about a state of model variables,
permitting  use  of  some  descent  algorithm  to  seek  a  minimum  cost.  However,  when  the 
cost  function  has  a  complicated  dependence  on  model  variables,  there  is  danger  that 
descent  algorithms  may  not  converge  or  may  locate  only  a  local,  rather  than  a  global, 
minimum. Two alternatives were described (N. Frazer). Simulated annealing (by analogy
to cooling from a melt) employs random perturbations of parameters while seeking to
reduce a cost function (characterized as a Gibbs free energy, U). When a particular
random perturbation reduces U, that perturbation is accepted and a subsequent
perturbation is applied. When a perturbation increases U, the perturbation may be
accepted with probability given by a Gibbs distribution exp(-U/T), where parameter T has
the role of temperature. The ‘art’ is to reduce T according to a ‘cooling schedule’ which is
efficient yet avoids ‘freezing’ into a local minimum of U. The second alternative, termed
genetic algorithms, bears similarity to simulated annealing. A population of possible
choices of sets of parameters is generated. Members are paired, and randomly chosen
portions of the parameter set are exchanged, including some ‘mutations.’ Members are
evaluated by a ‘fitness’ function (which might be the previous Gibbs distribution) to
determine  probable  representation  in  a  next  generation. 
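
A minimal sketch of simulated annealing for a single parameter follows. It makes several assumptions that are not from the lecture summarized above: the acceptance probability uses the increase in cost, exp(-dU/T), as in the standard Metropolis rule; the cooling schedule is geometric; and the cost function with several local minima is invented for illustration.

```python
import math
import random

def simulated_annealing(cost, x0, step=0.5, t_start=2.0, t_end=1e-3,
                        cooling=0.995, seed=0):
    """Seek a low value of cost(x) over a scalar x by simulated annealing."""
    rng = random.Random(seed)
    x, u = x0, cost(x0)
    best_x, best_u = x, u
    t = t_start
    while t > t_end:
        x_new = x + rng.uniform(-step, step)  # random perturbation
        u_new = cost(x_new)
        du = u_new - u
        # Accept every decrease; accept an increase with probability exp(-dU/T).
        if du <= 0.0 or rng.random() < math.exp(-du / t):
            x, u = x_new, u_new
            if u < best_u:
                best_x, best_u = x, u
        t *= cooling  # geometric cooling schedule (an assumption)
    return best_x, best_u

def bumpy_cost(x):
    """Hypothetical cost with several local minima; the lowest lies near x = -0.5."""
    return 0.1 * x * x + math.sin(3.0 * x) + 1.0

best_x, best_u = simulated_annealing(bumpy_cost, x0=4.0)
print("best parameter:", best_x, "cost:", best_u)
```

Whether the search reaches the lowest minimum depends on the step size and cooling rate: cooling too fast freezes the search in the basin where it started, while cooling too slowly wastes model evaluations.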

CHAOS 

Concepts  from  chaotic  dynamics  have  been  applied  to  account  for  fluid  behavior  that 
appears  to  be  random.  Laboratory  experiments  show  aspects  of  how  mixing  comes  about. 
A  theoretical  precept  about  turbulent  mixing  is  that  line  elements  advected  with  a  fluid 
should  tend  to  grow  exponentially  in  length.  It  is  argued  that  this  is  impossible  in  a 
two-dimensional  steady  flow.  Beginning  from  this  simple  case,  what  further  dynamics  are 
needed  before  exponential  line  stretching  occurs?  Experiments  (J.  Ottino)  show  that  a 
simple (periodic) time dependence is already sufficient to introduce chaotic regions,
yielding exponential line growth. Low order fixed points of the time-periodic mapping
dominate  the  regions  of  chaotic  mixing,  where  it  is  seen  that  hyperbolic  fixed  points  are 
surrounded  by  elliptic  islands.  Unmixed  regions  stretch  and  contract,  but  form  coherent 
islands  in  the  midst  of  chaos.  In  three  dimensional  systems  these  regions  might  appear  as 
tubes.  These  results  have  implications  regarding  the  character  of  velocity  fields  inferred  via 
flow  visualization. 

The  study  of  chaotic  dynamics  may  contribute  to  an  understanding  of  several  problems  in 
ocean  dynamics.  If  a  system  is  chaotic,  perturbation  expansions  are  misleading  and 
predictability is limited. It has been suggested that the El Niño/Southern Oscillation system
can  be  modeled  as  a  low  order  chaotic  system.  A  phase-space  reconstruction  analysis  (M. 
Brown)  performed  using  a  measured  time  series  of  eastern  tropical  Pacific  sea  surface 
temperature provides some support for this hypothesis. “Spaghetti plots” of float
trajectories  often  suggest  chaotic  behavior.  However,  floats  that  have  been  seeded  in  an 
isolated  eddy  may  undergo  many  rotations  of  the  eddy  as  it  translates  without  dispersing. 

It  is  suggested  that  such  coherent  eddies  are  essential  to  the  observation  of  anomalous 
diffusion (rates of particle separation greater than would result from random walks). On
the  theoretical  side,  WKB  ray  paths  of  surface  waves  propagating  through  an  ocean  of 
periodically  varying  depth  exhibit  chaotic  behavior  when  the  amplitude  of  the  topographic 
variation  exceeds  a  critical  value. 
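The first step of a phase-space reconstruction such as the one mentioned above is commonly a time-delay embedding of the scalar series. The following is a minimal sketch of that step only, assuming a delay-coordinate approach; the function name, the embedding dimension, and the lag are illustrative choices, and the input is a synthetic series rather than sea surface temperature data.

```python
import numpy as np

def delay_embed(series, dim, lag):
    """Build delay vectors [x(t), x(t + lag), ..., x(t + (dim - 1) * lag)].

    Returns an array of shape (n_vectors, dim), one delay vector per row.
    """
    x = np.asarray(series, dtype=float)
    n_vectors = x.size - (dim - 1) * lag
    if n_vectors <= 0:
        raise ValueError("series too short for this dim and lag")
    # Column i holds the series shifted forward by i * lag samples.
    return np.column_stack(
        [x[i * lag : i * lag + n_vectors] for i in range(dim)]
    )

if __name__ == "__main__":
    # Synthetic stand-in for a measured time series.
    t = np.linspace(0.0, 20.0 * np.pi, 2000)
    series = np.sin(t) + 0.5 * np.sin(2.3 * t)
    vectors = delay_embed(series, dim=3, lag=25)
    print(vectors.shape)
```

Estimates of dimension or of Lyapunov exponents are then computed from the cloud of delay vectors, and they depend on the choice of dim and lag as well as on the length of the record.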

A  major  problem  in  the  application  of  ideas  relating  to  chaos  is  the  determination  of 
whether  an  ocean  phenomenon  is  chaotic.  Estimates  of  measures  of  chaos  (such  as 
dimension  or  the  Lyapunov  exponent)  from  oceanographic  data  require  an  enormous 
number  of  degrees  of  freedom  for  any  reasonable  degree  of  confidence  and  are  often 
counter-intuitive.  For  example,  filtering  for  the  purpose  of  suppressing  noise  can 
sometimes  be  seen  to  increase  the  Lyapunov  exponent  (E.  Carter).  A  related  issue  is  the 
characterization  of  the  roughness  of  seafloor  topography.  Spectra  and  fractal  dimensions 
have  been  used.  A  novel  approach  considers  the  ‘geometric  temperature’  of  any  curve, 
calculated  from  the  number  of  intersections  of  the  test  curve  with  randomly  selected 
straight  lines  (W.  Woyczynski).  From  geometric  temperature,  one  may  proceed  to  an
analogous  thermodynamics  of  curves  and  surfaces. 
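As a rough illustration of how a temperature can be attached to a curve, the sketch below follows one published definition (due to Mendès France and co-workers); whether the workshop presentation used exactly this form is an assumption. For a connected plane curve of length L whose convex hull has perimeter C, the mean number of intersections with a random line that meets the curve is 2L/C, and the temperature is taken as T = 1 / ln(2L / (2L - C)). The function names and test curves are hypothetical.

```python
import math

def polyline_length(points):
    """Total length of a polyline given as a list of (x, y) tuples."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))

def convex_hull(points):
    """Convex hull by Andrew's monotone chain, counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    def half_hull(sequence):
        chain = []
        for p in sequence:
            while len(chain) >= 2 and cross(chain[-2], chain[-1], p) <= 0:
                chain.pop()
            chain.append(p)
        return chain

    lower = half_hull(pts)
    upper = half_hull(reversed(pts))
    return lower[:-1] + upper[:-1]

def hull_perimeter(points):
    """Perimeter of the convex hull (twice the length for a straight segment)."""
    hull = convex_hull(points)
    if len(hull) < 2:
        return 0.0
    return sum(math.dist(hull[i], hull[(i + 1) % len(hull)])
               for i in range(len(hull)))

def geometric_temperature(points):
    """T = 1 / ln(2L / (2L - C)); assumed definition, see the text above."""
    length = polyline_length(points)
    excess = 2.0 * length - hull_perimeter(points)
    if excess <= 1e-12 * length:
        return 0.0  # straight segment: every line crosses exactly once
    return 1.0 / math.log(2.0 * length / excess)

if __name__ == "__main__":
    segment = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
    zigzag = [(0.0, 0.0), (1.0, 1.0), (2.0, 0.0), (3.0, 1.0), (4.0, 0.0)]
    print(geometric_temperature(segment), geometric_temperature(zigzag))
```

Under this definition a straight segment has zero temperature, and the temperature rises as the curve becomes more convoluted within the same hull.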

STATISTICAL  DYNAMICS 

The  evolving  statistics  of  flows  have  long  been  an  object  of  turbulence  theory.  Renormalization
group  (RG)  techniques  have  been  applied  to  beta-plane  turbulence,  seeking  an
efficient  non-eddy-resolving  parameterization  of  small-scale  processes  to  enable  more 
cost-effective  large  scale  modeling.  Both  viscosity  and  beta  (Coriolis  gradient)  have  to  be 
renormalized  or  rescaled.  Such  renormalization  re-interprets  and  quantifies  previously 
obtained  results  and  yields  new  insights.  In  particular,  the  RG-derived  spectral  energy 
transfer  shows  that  beta-plane  turbulence  naturally  tends  to  self-organize  into  zonal-jetlike 
flows  and  westward  propagating  Rossby  waves.  Two-parametric  eddy  viscosity  and  beta 
coefficients  account  for  the  effect  of  unresolved  turbulence  and  waves  on  resolved  scales
and  are  suggested  for  use  in  non-eddy-resolving  simulations  of  beta-plane  turbulence  (B. 
Galperin).  A  further  statistical  study  of  beta-plane  turbulence,  utilizing  Lagrangian-based 
second  order  closure,  shows  that  the  westward  phase  speed  of  Rossby  waves  is 
significantly  enhanced  by  random  nonlinear  interaction  (Y.  Kaneda).

The  outcome  of  many  random  interactions  within  a  flow  can  sometimes  be  expressed  in  terms
of  a  large  scale  statistical  dynamics.  Numerical  experiments  with  inviscid  geostrophic 
dynamics  show  random  initial  conditions  forming  basin-scale  mean  flows.  When  viscosity 
is  included,  regions  of  homogenized  vorticity  form  in  the  basin  interior.  Inclusion  of 
topography  modifies  the  resulting  mean  flows.  Sufficiently  steep  topographic  slopes  will 
cause  jet-like  flows  along  isobaths,  corresponding  to  the  zonal  jets  in  Rossby  wave 
turbulence  (G.  Vallis).  A  tendency  for  random  interactions  to  yield  large  scale  mean  flows 
suggests  a  method  to  parameterize  eddies  by  directly  introducing  the  statistical  dynamical 
tendencies  at  large  scale.  Experiments  with  a  coarsely  resolved  global  ocean  model  show 
that  various  observed  large  scale  flows,  thought  to  be  due  to  eddy  interactions,  can  be 
obtained  by  such  parameterization  (G.  Holloway). 

CONCLUSIONS 

Oceanographic  phenomena  differ  widely  in  the  characteristics  of  data,  the  governing 
physics,  the  underlying  probability  space,  and  the  inferences  sought.  The  statistical 
methods  reflect  this  diversity.  They  range  from  signal  detection,  via  parameter  estimation 
and  data  assimilation,  to  stochastic  or  chaotic  models.  A  common  challenge  in  most
applications  is  the  huge  amount  of  data  and  the  complexities  of  the  physics.  The  system  space
or  the  number  of  degrees  of  freedom  must  be  reduced  to  a  manageable  size.  EOF  and  POP
analyses  are  used  to  reduce  large  and  complex  data  sets;  wavelet  and  inverse  scattering 
transforms  are  applied  to  represent  efficiently  the  underlying  physics;  idealized  dynamics 
are  employed  to  understand  chaotic  and  statistical  tendencies.  In  applying  these  and  other 
statistical  methods,  physical  oceanographers  must  modify  existing  methods  and  must 
invent  new  ones  to  deal  with  the  peculiarities  of  oceanographic  problems  in  a  sensible 
way.  The  workshop  witnessed  the  considerable  progress  that  is  being  made  at  this  task. 
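The reduction of a large data set by EOF analysis, mentioned above, can be sketched with a singular value decomposition. This is a minimal illustration under stated assumptions: the data are arranged as a (time, space) array, no area weighting or missing-value handling is applied, and the function name and synthetic field are hypothetical.

```python
import numpy as np

def eof_analysis(data, n_modes):
    """Leading EOFs of a (time, space) array via SVD of the anomalies.

    Returns (eofs, pcs, variance_fraction):
      eofs              (n_modes, space)  spatial patterns
      pcs               (time, n_modes)   principal-component time series
      variance_fraction (n_modes,)        share of total variance per mode
    """
    anomalies = data - data.mean(axis=0)  # remove the time mean at each point
    u, s, vt = np.linalg.svd(anomalies, full_matrices=False)
    variance_fraction = s**2 / np.sum(s**2)
    eofs = vt[:n_modes]
    pcs = u[:, :n_modes] * s[:n_modes]
    return eofs, pcs, variance_fraction[:n_modes]

if __name__ == "__main__":
    # Synthetic field: one dominant pattern plus a weaker second one.
    t = np.linspace(0.0, 2.0 * np.pi, 50)
    pattern_1 = np.array([1.0, 2.0, -1.0, 0.5])
    pattern_2 = np.array([0.5, -0.5, 1.0, 1.0])
    field = np.outer(np.sin(t), pattern_1) + 0.1 * np.outer(np.cos(3.0 * t), pattern_2)
    eofs, pcs, frac = eof_analysis(field, n_modes=2)
    print(frac)
```

Truncating to the first few modes, pcs @ eofs, gives a low-dimensional approximation of the anomaly field, which is the sense in which the number of degrees of freedom is reduced.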